Gathering the Data
Welcome back to my blog! The second week of my internship at Mayo Clinic has been very eventful. I have settled into the bioinformatics lab and met many wonderful people. I also finally received my Mayo Clinic staff badge and discovered the staff cafe, which serves really good food.
I have been working mainly with Sikh, a student at ASU. He is really nice and has been an invaluable resource for me. Using a program called Notion, we gathered a dataset of patients from the Mayo Clinic database, anonymized their records, and fetched their MRI scans. Each scan captures numerous cross-sections of the torso in the transverse plane, and when combined, they form a 3-D representation. In the interest of protecting patient identity, I cannot reveal the actual MRI scans I will be working with until much later in the process, but they look something like this.
However, we also faced many unexpected challenges while preparing the data. Some of the studies had corrupted or non-existent data, and others had a mismatched Medical Record Number (MRN), an identifying value we used to search the database. For example, one study might have an extra digit that marked its origin as the Mayo Clinic facility in Rochester, but we would later discover that the study was actually located in the database of an Arizona facility. This was especially annoying because it forced us to re-run several queries.
Moreover, the Notion software we used was extremely unreliable. Due to the large quantities of data it managed, it easily overloaded, causing our queries to crash and fail. We tried to avoid this by working at a slower pace, but we still had to re-fetch several studies because of Notion crashes. In time, we were able to achieve the "completed" status on most of our studies, with the exception of those with corrupted or non-existent data.
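To illustrate the "slow down and re-fetch" approach described above, here is a minimal Python sketch. It is only an illustration under stated assumptions: Notion was driven by hand, not by code, and `fetch_study`, `max_attempts`, and `pause_seconds` are hypothetical names invented for this example.

```python
import time


def fetch_with_retries(fetch_study, study_id, max_attempts=3,
                       pause_seconds=1.0, sleep=time.sleep):
    """Call fetch_study(study_id), pausing and retrying if it crashes.

    fetch_study is a hypothetical stand-in for whatever actually pulls a
    study; it is assumed to raise RuntimeError when the fetch fails.
    Returns ("completed", result) on success or ("failed", None) otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return "completed", fetch_study(study_id)
        except RuntimeError:
            if attempt < max_attempts:
                # Wait a little longer after each failure so the tool
                # has time to recover before the next request.
                sleep(pause_seconds * attempt)
    return "failed", None
```

A study that still fails after the last attempt comes back as "failed", which roughly matches the studies with corrupted or non-existent data that never reached the "completed" status.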
Now that the data is downloaded, we will examine it with a program called ImageJ in the following weeks. Dr. Kehler will be helping us interpret the results and mark certain regions of interest in the cancerous liver tissue for texture analysis. We hope to discover characteristic image features of hepatocellular carcinoma (HCC) that correlate with the recent findings in pathology.
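To give a rough idea of what "texture analysis" on a region of interest can mean, here is a small Python sketch of one common technique, the gray-level co-occurrence matrix (GLCM). This is an assumption chosen for illustration: it is not necessarily the method ImageJ uses or the one we will end up applying, and the tiny images below are made up, not patient data.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy).

    image is a list of rows of integer gray levels in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Average squared gray-level difference between neighboring pixels."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Two made-up 3x3 "regions of interest" with 4 gray levels (0 to 3).
smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checker = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]

print(contrast(glcm(smooth, 4)))   # 0.0, uniform texture
print(contrast(glcm(checker, 4)))  # 9.0, sharply alternating texture
```

The point of a number like this is that it summarizes how pixel intensities vary across a region, which can differ between tissues even when the difference is hard to see by eye.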
Thanks for an excellent post this week. MRIs are more interesting than I realized; I always thought an MRI created pictures. Notion definitely appears to be buggy, but at least you found a way around it. Thanks for explaining all these concepts; it's easier to follow along without having to look up these terms. What will ImageJ do for you? I understand that it'll help you examine the data, but what will it show that the human eye can't?
Interesting post. I'm glad to see that you are making progress in your post. I can't wait to read how you analyze the data.
Thank you! I hope to explain all that in my next few blog posts.
It's good to see that you're making progress. Are you gonna change your software?
Notion is the only software that is really integrated with the Mayo database, but fortunately I won't need to use it again for a while because I finished loading the data I need.
Your new post is interesting! I am happy you have managed to get through all the problems during your week. Can you explain what the ImageJ software does?
Hi! I'm glad you are settling into your project! What can Notion show you that ImageJ can't? Are there any other programs that you are going to need to use for the rest of the project? I'm looking forward to next week's post.
Notion is used to gather the data, and ImageJ will be used for texture analysis on the images to identify visible and sub-visible features.
Really interesting post! You sound like you are learning a lot! How do you get around Notion's unreliability, or do you just have to deal with it?
Yeah, I kinda had to deal with it by going a lot slower to avoid overloading the program. Fortunately, I'm done processing the data, so I won't need to use Notion again anytime soon.
It was very fascinating to hear about the different ways you went about obtaining the data. Did the crashing and reloading stop the process significantly, or were you able to get the data in a reasonable time span?
Although Notion was a bit buggy, I was able to process most of my data after a while and multiple attempts.
It is unfortunate that Notion kept crashing and that there were so many problems this week. What will you be using to mark the areas of cancerous liver tissue?
I will be using another program called ImageJ to view the MRI scans and mark regions of interest for further analysis.
This is a very interesting post! It's amazing to see all the progress by week 2! It was also very fascinating to see the different ways of obtaining data.
Great stuff. Is there better software available? Notion sounds pretty buggy to me. Do you think that you will pick up tricks on how to distinguish the new type of HCC?
Unfortunately, I'm stuck with Notion, as it is one of the few programs integrated with the Mayo database. I think they coded much of it themselves too. I hope to begin texture analysis soon, which will allow me to begin distinguishing HCC types.
You are making great progress fast! Looks like a lot of fun. Were there any challenges you weren't able to overcome?
Although there were a few obstacles, I was able to work around them. The main problem was my lack of familiarity with the software, but I'm making lots of progress in that area.
Hi Richard, I'm happy to see you are progressing in your research. Given the disorganization of the MRI scans, do you plan to propose a fix for that portion and thus improve the quality of feedback in MRI scans?