This repo parses XML files. The files contain publicly available patient clinical features, downloaded from the GDC portal for the TCGA cancer project.
Parsing large XML files in Python is hard. On one hand, regular XML libraries load the whole file into memory, which can exhaust available memory and crash the process if the file is too big. Other solutions such as iterparse do ...
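
To illustrate the memory concern described above, here is a minimal sketch of incremental parsing with the standard library's `xml.etree.ElementTree.iterparse`. It is not necessarily how this repo does it. The helper names (`local_name`, `iter_records`), the `SAMPLE` document, and its element names are hypothetical and are not taken from the real TCGA clinical schema. The sketch also assumes each record element is a direct child of the root.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, tiny stand-in for a clinical XML file (not the real TCGA schema).
SAMPLE = b"""<?xml version="1.0"?>
<cohort xmlns="urn:example:clinical">
  <patient><gender>FEMALE</gender><vital_status>Alive</vital_status></patient>
  <patient><gender>MALE</gender><vital_status>Dead</vital_status></patient>
</cohort>"""


def local_name(tag):
    """Drop a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def iter_records(source, record_tag):
    """Yield one dict per record element, discarding each record once it is read.

    Assumes records are direct children of the root element.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == record_tag:
            # Collect the text of each direct child, keyed by its local tag name.
            yield {local_name(child.tag): (child.text or "").strip() for child in elem}
            # Detach processed records so the in-memory tree does not keep growing.
            root.clear()


# Usage: any file path or binary file object can replace the BytesIO wrapper.
records = list(iter_records(io.BytesIO(SAMPLE), "patient"))
```

Without the `root.clear()` call, `iterparse` would still build up the full tree as it reads, so memory use would grow with the file size even though the file is read incrementally.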