New perspectives for the Annelida collection (National Museum/UFRJ) database: using data visualization to analyze and manage biological collections

Authors

  • Camila Simões Martins de Aguiar Messias, Universidade Federal do Rio de Janeiro
  • Carlos Cesar de Oliveira Fonseca, Fundação Getúlio Vargas
  • Monique Cristina dos Santos, Universidade Federal do Rio de Janeiro
  • Asla M. Sá, Fundação Getúlio Vargas
  • Joana Zanol, Universidade Federal do Rio de Janeiro

DOI:

https://doi.org/10.1590/

Keywords:

Polychaetes, Biological collection, Management, Interactive visual representations

Abstract

Collection management faces many challenges in preserving stored items and keeping the associated information accurate and organized. For this biodiversity repository to expand and be used, the database must be unambiguous, and errors must be quickly identified and corrected. This work shows how interactive visual representations (IVRs) of the collection’s metadata can serve as tools to inspect the data and help address these challenges. We used the Annelida collection database of the National Museum (MN) of the Federal University of Rio de Janeiro (UFRJ). Interactive graphs of its metadata (catalog date, taxonomic identification and determiners, sampling, depth, geographic location, and collector data) were created with the Altair library in Python 3. Analyzing the data with these graphs made it possible to identify anomalous patterns and fill in missing records. The graphs also revealed the spatial and bathymetric distribution of the specimens deposited over time and the growth rate of the collection in each family, which supports projecting future growth and planning the physical organization of vials. Graphs are an ally in managing collections that use digital entry forms, and they make the metadata associated with cataloged specimens easier to access. IVRs can also be used to give credit to the researchers involved in building biological collections. Visualization tools are thus efficient at recognizing global patterns in databases and at solving biological collection management tasks.
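
As a rough illustration of the kind of graph described in the abstract, the sketch below prepares cumulative growth per family, flags records with missing or negative depth, and builds an interactive Altair line chart. It is not the authors' code: the records, the field names (catalog_no, family, catalog_year, depth_m), and the helper functions are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

# Hypothetical catalog records; the field names are assumptions for this
# sketch and do not reflect the actual MN/UFRJ database schema.
RECORDS = [
    {"catalog_no": 1, "family": "Nereididae", "catalog_year": 2019, "depth_m": 12.0},
    {"catalog_no": 2, "family": "Nereididae", "catalog_year": 2020, "depth_m": 30.5},
    {"catalog_no": 3, "family": "Syllidae", "catalog_year": 2020, "depth_m": None},
    {"catalog_no": 4, "family": "Nereididae", "catalog_year": 2020, "depth_m": -5.0},
    {"catalog_no": 5, "family": "Syllidae", "catalog_year": 2021, "depth_m": 150.0},
]


def cumulative_growth(records):
    """Return one row per (family, year) with the running total of records."""
    counts = Counter((r["family"], r["catalog_year"]) for r in records)
    years_by_family = defaultdict(list)
    for family, year in counts:
        years_by_family[family].append(year)

    rows = []
    for family in sorted(years_by_family):
        total = 0
        for year in sorted(years_by_family[family]):
            total += counts[(family, year)]
            rows.append({"family": family, "year": year, "cumulative": total})
    return rows


def flag_suspect_depths(records):
    """Return catalog numbers whose depth is missing or negative."""
    return [
        r["catalog_no"]
        for r in records
        if r["depth_m"] is None or r["depth_m"] < 0
    ]


def growth_chart(rows):
    """Build an interactive line chart of cumulative growth per family.

    Altair and pandas are imported here so the data preparation above
    can run without them installed.
    """
    import altair as alt
    import pandas as pd

    return (
        alt.Chart(pd.DataFrame(rows))
        .mark_line(point=True)
        .encode(
            x=alt.X("year:Q", axis=alt.Axis(format="d"), title="Catalog year"),
            y=alt.Y("cumulative:Q", title="Cumulative records"),
            color="family:N",
            tooltip=["family", "year", "cumulative"],
        )
        .interactive()  # enables pan and zoom on both axes
    )
```

With Altair and pandas installed, growth_chart(cumulative_growth(RECORDS)).save("growth.html") would write a standalone page with the chart, and flag_suspect_depths(RECORDS) lists the records a curator might review first.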

Published

10.04.2024

How to Cite

New perspectives for the Annelida collection (National Museum/UFRJ) database: using data visualization to analyze and manage biological collections. (2024). Ocean and Coastal Research, 72(Suppl. 1), e24016. https://doi.org/10.1590/