IDCC19 Melbourne, Australia
The International Digital Curation Conference (IDCC) was held in Melbourne, Australia from February 6th-8th, 2019. The intended audience of the conference is anyone who is interested in curating digital materials for posterity. Personally, I had never heard of the conference but found it intriguing, as I am a PhD candidate working with historical Hawaiian language text documents.
Thanks to Springer Nature and the American Association of University Women-Hawaii Chapter, I was able to attend and present at IDCC19. Honestly, this conference was like nothing else that I have experienced, and it was certainly beneficial to my career thanks to the presentations, workshops, and networking events. One amazing thing about this conference was that everyone was so willing to share their information. For instance, check out the program, which includes links to collaborative notes.
In the remainder of this blog post, I will highlight some of my favorite presentations from the conference.
February 4th - Digital Preservation Carpentry
I attended a pre-conference Carpentry-style workshop on Digital Preservation. Being a newcomer to digital preservation, I had no clue what to expect. By the end of the session, I had definitely gotten my money's worth. In the session, we did a hands-on lesson about an open-source tool called BagIt. BagIt is used to create standardized sets of digital information prior to archival ingest (this phrase is very "archival" but here is a link to a diagram about the archival process that really helped me during the workshop). The BagIt program is great for ensuring the full transfer of digital materials into your archiving system #checksums #allthebits. The second half of the workshop was dedicated to digital archiving workflows. This hands-on activity really helped to showcase the ambiguity of workflows and that there is definitely no single workflow that fits every situation. From my perspective, it is great to have a theoretical workflow, but as an archivist you must be flexible with the workflow and be respectful of the stakeholders and the information associated with the process.
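To make the checksum idea concrete, here is a minimal Python sketch of what "ensuring the full transfer" can look like. It is not the BagIt tool from the workshop and does not produce a real bag; it only illustrates the underlying principle of recording a checksum for every file before a transfer and comparing afterwards. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) and the choice of SHA-256 are assumptions made for this illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(payload_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to payload_dir) to its checksum."""
    return {
        p.relative_to(payload_dir).as_posix(): sha256_of(p)
        for p in sorted(payload_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(payload_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = payload_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In use, you would call `build_manifest` on the folder before sending it and `verify_manifest` on the received copy; an empty list means every recorded file arrived bit-for-bit intact. The actual BagIt specification goes further than this sketch: it places the payload under a `data/` directory and stores the checksums in manifest text files alongside it.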
The session on Grand Curation Challenges Across Disciplines was quite interesting and seemed to boil down to one idea: a wide range of disciplines make a large upfront investment in producing data, and we need to curate that data in the hope that it will be reused. However, the challenges lie in actually curating and reusing the data. One of the major challenges is deciding at what point of the "data lifecycle" data should be collected and curated. Another is the lack of version control for data that is produced by complex machinery. A theme throughout the conference that resonated in this session was metadata: developing descriptive tags or information about the data.
Another presentation that I enjoyed was Carolyn Hank's "Dead, Dormant, Zoetic: Modeling the Blog Lifecycle". It reminded me of digital traces...do they ever disappear? Prior to social networking sites, blogs were the way to express oneself online. Carolyn Hank has collected over a decade of research on blogs and has seen their lifecycle, including the resurrection of blogs after many years. The question she is now addressing in her research is how and why blogs go dormant and/or resurrect themselves. I can only imagine that, with all of the policy changes associated with social networking platforms and their data use, many blogs may be resurrected.
How do you work with digital data reuse, especially in terms of data for publication?
Well, Ryan Stoker and Jen McLean @jennymcmac presented a paper on leveraging the Cultural Competency Framework from biomedical researchers and applying it to data reuse. The one item they showcased in their presentation that I did not previously know about was Traditional Knowledge labels.
The final keynote by Dr. Patricia Brennan summed up the conference very nicely: there is a need to curate digital data so that it can be used for future studies. Machine learning techniques and natural language processing have a lot of potential to improve data curation and re-use, but developing those technologies takes a lot of knowledge and cyberinfrastructure. If machine learning technologies are used, we need to be wary of #algorithmicbias, which Dr. Sayeed Choudhury spoke about in his presentation on the DCC Curation Lifecycle Model 2.0.