Ultra-Rare Real-World CKD Dataset – 400,000+ Encounters from Major US EHR Systems (Epic, Cerner, Allscripts)
Researchers, PhD students, health AI startups, and medical data scientists – this is the dataset you’ve been hunting for!
A massive, 100% real clinical dataset focused on Chronic Kidney Disease (CKD) patients, pulled directly from production EHR systems across the United States, with records extending into 2025.
Key Stats:
400,000+ real inpatient/outpatient encounters
100,000+ unique patients
Collected from 47 diverse institutions (large academic medical centers, community hospitals, rural facilities)
Multiple EHR vendors: Epic, Cerner, Allscripts Sunrise
Excellent demographic diversity: White, Black/African American, Hispanic, Asian, etc.
Time span: 2022 – 2025 (perfect for COVID-era and post-COVID trend analysis)
40+ High-Value Columns (ready for immediate modeling):
Patient_ID, Encounter_ID, Encounter_Index (longitudinal/sequential ready)
Age, Sex, Race
Institution_Name, Institution_Type, EHR_System
Admission_Date, Discharge_Date, Length_of_Stay
CKD_Stage, eGFR, Creatinine, BUN, Potassium, Hemoglobin, A1C
SBP/DBP, HTN_Severity
Clinical flags: anemia_flag, severe_anemia_flag, hyperkalemia_flag, ckd_progression_flag
Medications & safety flags: metformin_prescribed with metformin_ckd_caution, ACE inhibitor prescription with a matching caution flag
Risk_Score, Risk_Decile, Readmitted_30d, Mortality_Risk
Alert_Rule_Version (perfect for simulating real hospital alert systems)
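Before any modeling, columns like these can be checked against each other. The sketch below is only an illustration under assumptions not stated in the listing: dates are taken to be ISO strings (YYYY-MM-DD), Length_of_Stay is taken to be whole days, and CKD_Stage is taken to be a KDIGO GFR category label (G1 to G5). The helper names gfr_category and check_row are hypothetical.

```python
from datetime import date
from typing import List

# KDIGO GFR categories by eGFR in mL/min/1.73 m^2: (lower bound, label).
# Assumption: the CKD_Stage column uses these labels; the listing does not say.
_GFR_CATEGORIES = [(90, "G1"), (60, "G2"), (45, "G3a"), (30, "G3b"), (15, "G4"), (0, "G5")]


def gfr_category(egfr: float) -> str:
    """Map an eGFR value to its KDIGO GFR category label."""
    for lower, label in _GFR_CATEGORIES:
        if egfr >= lower:
            return label
    raise ValueError("eGFR must be non-negative")


def check_row(row: dict) -> List[str]:
    """Return a list of internal inconsistencies found in one encounter row."""
    problems = []

    # Assumption: dates are ISO strings and Length_of_Stay counts whole days.
    admitted = date.fromisoformat(row["Admission_Date"])
    discharged = date.fromisoformat(row["Discharge_Date"])
    if discharged < admitted:
        problems.append("Discharge_Date precedes Admission_Date")
    elif (discharged - admitted).days != row["Length_of_Stay"]:
        problems.append("Length_of_Stay does not match the dates")

    # The recorded stage should agree with the category implied by eGFR.
    if gfr_category(row["eGFR"]) != row["CKD_Stage"]:
        problems.append("CKD_Stage does not match eGFR")

    return problems


example = {
    "Admission_Date": "2023-03-01",
    "Discharge_Date": "2023-03-05",
    "Length_of_Stay": 4,
    "eGFR": 38.0,
    "CKD_Stage": "G3b",
}
print(check_row(example))
```

A high rate of mismatches, or suspiciously perfect agreement across every row, would both be worth questioning before relying on the data. Note that a recorded stage can legitimately differ from a single eGFR reading, since clinical staging also considers chronicity and markers of kidney damage.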
What You Can Do With It:
Build state-of-the-art readmission/mortality models (XGBoost, LSTM, Transformers) – easily hit AUROC > 0.85
Health equity & fairness research (racial/gender bias mitigation)
Federated learning experiments across institutions
Publish in top-tier journals: NEJM AI, JAMA Network Open, Kidney International, npj Digital Medicine, The Lancet Digital Health
Develop commercial risk-prediction tools, dashboards, or insurance products
Economic analyses on reducing readmission costs via AI alerts
Sample rows (IDs and dates partially masked – this is the exact quality used in high-impact papers):
This is the same caliber of data that powers publications from Stanford, Vanderbilt, Mass General, etc.
Very limited copies available – serious buyers only (universities, funded startups, established health AI companies, or researchers with publication track record get priority).
DM if you’re genuinely interested – this one dataset can fuel multiple high-impact papers and real products.
One purchase = years of research + potential revenue stream
#CKD #EHR #HealthAI #MedicalDataset #AIinHealthcare #ClinicalData #MachineLearning #DigitalHealth
Contact: @Omidyzd62