The data were pulled from patient record charts, and the sample size is over 20,000. So much of the data is missing that some columns are entirely blank and others have only 10 values. I can think of several reasons for the missing data: either our study population is young, so the patients don’t have those diseases and the doctors had no reason to record them, or the people who pulled the data missed something. For now, all I can do is ask the people who pulled the data.
In the meantime, what else should I do to handle missing categorical data? The majority of the variables are binary (Y/N).
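
To make the extent of the missingness concrete, the per-column pattern could be tabulated along these lines. This is only a minimal sketch, assuming the data is loaded in pandas and that blanks are read in as NaN/None; the function name `missing_summary` and the toy column names (`diabetes`, `asthma`, `smoker`) are hypothetical, not taken from the actual dataset.

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column count of recorded values and fraction missing."""
    summary = pd.DataFrame({
        "n_present": df.notna().sum(),      # how many values were recorded
        "frac_missing": df.isna().mean(),   # share of rows with no value
    })
    # Flag columns that contain no recorded values at all
    summary["all_blank"] = summary["n_present"] == 0
    return summary.sort_values("frac_missing", ascending=False)

# Toy stand-in for the chart data (hypothetical columns and values)
records = pd.DataFrame({
    "diabetes": ["Y", "N", None, "N"],
    "asthma": [None, None, None, None],
    "smoker": ["N", None, None, "Y"],
})

summary = missing_summary(records)
print(summary)
```

A table like this separates the columns that are entirely blank from those that are only sparsely filled, which may help when asking the people who pulled the data what a blank means.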