Corresponding Author: Gerald C. Hsu, eclaireMD Foundation, USA.
This paper discusses the methodology and accuracy of the author's postprandial plasma glucose (PPG) prediction model, which was developed using biomedical wave theory and signal processing techniques from geophysics and communication electronics engineering.
The author received an honorary PhD in mathematics and majored in engineering at MIT. He attended different universities over 17 years and studied seven academic disciplines including mathematics, engineering, computer science, and business administration. He also spent 17 years self-studying psychology, chronic diseases, and food nutrition. He has worked in various industries including defense, nuclear power, computer-aided design, computer hardware design, software development, robotics, and semiconductor chip design.
He has spent 20,000 hours in type 2 diabetes (T2D) research since 2010. First, he studied six metabolic diseases and food nutrition during 2010-2013, then conducted research on metabolism during 2014. His approach is “math-physical medicine” based on mathematics, physics, engineering modeling, optical physics, signal processing, computer science, big data analytics, statistics, machine learning, and artificial intelligence (AI). This approach could provide quantitative data and precise results to interpret some biomedical phenomena and prove some biomedical findings. His main focus is on preventive medicine for chronic disease control using five prediction tools he developed during the period of 2015-2018, i.e. Weight, FPG, PPG, Adjusted Glucose, Estimated A1C. He believes that the better the prediction, the more control one would have over chronic disease.
Setting aside the debate over the accuracy of glucose testing methods, whether lab-tested A1C or finger-prick test strips, the author has collected a complete set of PPG data using lab-tested A1C and finger-prick test strips, plus his own recorded lifestyle data, over a period of 1,181 days covering 4,029 meals (6/1/2015 – 8/25/2018). This PPG-related data set of ~400,000 data points is only a small portion of his entire collection of ~1.5 million data points. Furthermore, since 5/5/2018, he has collected around 9,000 glucose readings over 112 days (about 80 glucose measurements per day) from a Libre sensor on his upper arm. This new dataset has provided more information regarding glucose waves, especially waveform details and the “total energy and intensity associated with glucose waveforms”.
Due to his mathematics and engineering background, he views these data curves related to biomedical conditions and lifestyle management as a collection of various nonlinear input and output signal waves of the human body. At first, he applied the finite element method, an engineering modeling technique, to convert this “analog” human system into a “digitized” mathematical system in order to obtain an approximate solution for the real human biomedical conditions.
He sees each digitized sub-wave as representing a single-source created contribution component of the final combined PPG wave. Therefore, he applied wave theory and signal processing techniques to decompose this measured PPG signal into around 20 single-sourced sub-waves with about 10 significant sub-waves. He carefully checked each sub-signal waveform for its completeness, accuracy, and correlation with other curves, using statistical tools, such as time-series, spatial, and frequency domain analyses, etc.
Over the past three years, he continuously explored and added missing influential factors to the formation of the PPG signal. His aim was to improve the content and accuracy of the predicted PPG waveform while maintaining high correlation with the measured PPG waveform.
For example, by the fall of 2016, the accuracy of his predicted PPG had reached ~95%. In September of 2017, he identified that ambient temperature (weather condition) also had an impact on glucose value. Therefore, he selected an initial period of about two years (6/2015 – 7/2017), examined his travel schedule in detail, and entered each day’s local ambient temperature for the city where he stayed (he has traveled roughly every 14 days over the past 8 years). In this way, he was able to add a new contribution from a temperature sub-wave, which brought the accuracy of the predicted PPG from ~95% to ~98%.
Another factor was that his glucose was quite high when he was sick with the flu for a month at the end of 2017. It should be pointed out that the author had not had a cold or the flu in the 25 years before December of 2017. After that experience, he further enhanced his prediction model by including “physical sickness or wellbeing”, which finally brought the prediction accuracy to 99.9%.
The author used his measured data as the base for data comparison. He has safeguarded the integrity of his data and has never altered its original content or influenced its integrity since the sole motivation of his research is to save his own life from diabetes complications; not for fame, power, or money.
All data was collected in its entirety from one patient only (himself), via a customized software program, over an extended period of time. Therefore, the author needed very little “data cleaning” before starting the next steps of his research work, which included data validation, numerical analysis, and biomedical interpretation. This project does not have to be concerned with problems such as data interference and data contamination due to different genetic conditions, various lifestyles, and contradicting interpretations. These data come from a consistent sample source, making it much easier for the author to dive into one variable and extract the deeply buried information.
After analyzing each sub-wave in detail, he was ready to reintegrate these sub-waveforms into another nonlinear predicted PPG waveform. During this stage of analysis, using a Libre continuous glucose monitoring device, the author further investigated some additional characteristics that are not available via the traditional finger-prick method, such as peak value comparison, glucose rising and declining speeds, and the total energy and intensity associated with glucose waveforms. This extra information will be extremely useful in his follow-on research into the relationship between diabetes conditions and other internal organ complications.
He further improved his model via a “curve-fitting” trial-and-error engineering method that he learned during his career in the defense industry. He continuously compared the predicted and measured data sets and improved the model until it reached a very high linear accuracy while still maintaining high correlation between the two primary variables. High correlation means that the trend of the predicted curve moves along with the measured curve like its “twin”.
For hemoglobin A1C estimation, he specifically added a “safety margin” on top of his estimated A1C values calculated from the predicted values of both FPG and PPG. This “safety margin” concept was learned during his career in the nuclear power industry. During the period of 2012-2015, he added a safety margin of 15%; after 2015, when his glucose was under control below 120 mg/dL, he reduced the safety margin from 15% to 5%. This extra “safety margin”, averaging 7% to 10% on top of his originally predicted A1C value, is intended to provide a numerical buffer that can serve as an “early warning” to T2D patients. Both the Adjusted Glucose and Estimated A1C models also utilized “self-learning and self-adjusting” machine-learning algorithms in order to correct or compensate for the built-in “errors” from the testing process of lab tests and finger-prick glucose measurements, such as chemical, environmental, and operational variances.
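The safety-margin idea can be illustrated with a minimal sketch. It assumes the margin is applied multiplicatively to the estimated A1C, which the text does not state explicitly; the function name and the example A1C value of 6.5 are hypothetical.

```python
def a1c_with_safety_margin(estimated_a1c: float, margin: float) -> float:
    """Raise an estimated A1C by a fractional safety margin.

    Assumption: the margin is multiplicative (e.g. 0.15 for 15%).
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return estimated_a1c * (1.0 + margin)


# Hypothetical estimated A1C of 6.5 under the two margins named in the text.
early_period = a1c_with_safety_margin(6.5, 0.15)  # 15% margin (2012-2015)
later_period = a1c_with_safety_margin(6.5, 0.05)  # 5% margin (after 2015)
```

Under this assumption, the same underlying estimate is reported higher during the earlier period, which is what gives the margin its “early warning” character.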
As shown in Figure 1, during the period of 1,181 days (6/1/2015 – 8/25/2018), average PPG values are:
- Predicted: 119.53 mg/dL
- Measured: 119.59 mg/dL
- with 99.9% linear accuracy and a high correlation of 84.4%.
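The paper does not define “linear accuracy”. One plausible reading, assumed here purely for illustration, is one minus the relative difference between the predicted and measured averages. The function name is hypothetical.

```python
def linear_accuracy(predicted_mean: float, measured_mean: float) -> float:
    """Return 1 - |predicted - measured| / measured.

    Assumption: this is an illustrative definition, not one given in the paper.
    """
    if measured_mean == 0:
        raise ValueError("measured_mean must be non-zero")
    return 1.0 - abs(predicted_mean - measured_mean) / measured_mean


# Average PPG values reported for the 1,181-day period.
accuracy = linear_accuracy(119.53, 119.59)
accuracy_percent = round(accuracy * 100, 1)
```

With the reported averages, this definition gives a value that rounds to 99.9%, consistent with the figure quoted above. Because it compares only the two averages, it says nothing about day-to-day agreement, which is why the correlation is reported separately.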
Figure 1: Predicted vs. Measured PPG; Correlation between 2 influential factors and PPG
It should be noted that an overlapping period of 1,059 days (10/1/2015 – 8/25/2018) was used for calculating the 90-day moving average for easy viewing of the PPG trend (similar to the concept of “dynamic daily A1C”). The first 90-120 days of data were not used in the calculation for reasons of data stability.
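A trailing moving average of this kind can be sketched as follows. The 90-day window comes from the text; the function name and the choice of a trailing (rather than centered) window are assumptions. The date arithmetic simply checks the day counts quoted above.

```python
from collections import deque
from datetime import date


def trailing_moving_average(values, window=90):
    """Return the mean of the most recent `window` values, once the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)
    running = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            running -= buf[0]  # value about to be evicted by append()
        buf.append(v)
        running += v
        if len(buf) == window:
            out.append(running / window)
    return out


# Day counts implied by the dates in the text (differences between dates).
full_period_days = (date(2018, 8, 25) - date(2015, 6, 1)).days
overlap_days = (date(2018, 8, 25) - date(2015, 10, 1)).days
skipped_days = (date(2015, 10, 1) - date(2015, 6, 1)).days
```

The date differences reproduce the 1,181-day and 1,059-day periods, with 122 days skipped at the start.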
As shown in Figures 1, 2 & 3, the PPG’s major influential factors’ contribution on daily PPG value and their individual contribution margins are as follows:
- Carbs/Sugar: +14.3 mg/dL, 38%
- Post-meal walking: -15.5 mg/dL, 41%
- Temperature: +3.7 mg/dL, 10%
- All others: +1.95 mg/dL, 11%
- Net gain on PPG: +4.3 mg/dL
As shown in Figure 1, the correlation coefficients between the major influential factors and measured PPG (119 mg/dL) are:
- Carbs/Sugar (14.3 gram): +57% (high positive value means higher intake of carbs/sugar pushes PPG higher)
- Post-meal walking exercise (4,172 steps): -80% (high negative value means higher amount of exercise brings PPG lower)
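The paper does not say which correlation coefficient was used. The sketch below assumes the common Pearson coefficient; the function name and the sample values are hypothetical and are not the author's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily values: more post-meal steps paired with lower PPG.
steps = [2000, 3000, 4000, 5000, 6000]
ppg = [135.0, 128.0, 121.0, 117.0, 110.0]
r_steps = pearson_r(steps, ppg)  # negative, as for the walking factor above
```

A positive coefficient corresponds to the carbs/sugar case (higher intake, higher PPG) and a negative one to the walking case (more steps, lower PPG).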
As shown in Figure 3, the impact of temperature on PPG is evident, especially in warmer weather (>77°F). PPG increases by about 0.9 mg/dL for each degree of temperature above 77°F, which the author attributes to increased energy demand and metabolic activity. In contrast, FPG decreases by about 0.3 mg/dL for each degree of temperature below 67°F, which he attributes to a “hibernation” effect on the human body.
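The two temperature effects can be written as simple piecewise-linear adjustments. This is a minimal sketch using the slopes and thresholds quoted above; the function names are hypothetical, and treating the effect as exactly linear beyond each threshold is an assumption.

```python
def ppg_warm_adjustment(temp_f: float, threshold_f: float = 77.0,
                        slope: float = 0.9) -> float:
    """PPG change in mg/dL: +0.9 per degree F above 77°F, else zero."""
    return slope * max(0.0, temp_f - threshold_f)


def fpg_cold_adjustment(temp_f: float, threshold_f: float = 67.0,
                        slope: float = 0.3) -> float:
    """FPG change in mg/dL: -0.3 per degree F below 67°F, else zero."""
    return -slope * max(0.0, threshold_f - temp_f)


# Example: a warm day at 87°F and a cold day at 57°F.
warm_day_ppg_change = ppg_warm_adjustment(87.0)  # about +9 mg/dL
cold_day_fpg_change = fpg_cold_adjustment(57.0)  # about -3 mg/dL
```

Between 67°F and 77°F both adjustments are zero in this sketch, since the text describes no temperature effect in that range.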
Figure 2: Decomposition of 4 Major Sub-Waveforms of PPG
Figure 3: Ambient Temperature (Weather) contribution to PPG; 3-years Residence Temperature Record
As shown in Figure 4, for an overweight patient (BMI 25 – 30), the correlation coefficient between PPG and weight is a low 19% in time-series analysis. In the spatial analysis diagram, his PPG values stay within a “constant band” regardless of his weight fluctuation. Specifically, in this spatial analysis using 2,427 days of data (1/1/2012 – 8/25/2018), the PPG variance band between 103 mg/dL (-20%) and 155 mg/dL (+20%), around an average PPG value of 128.8 mg/dL, covers 86% of the total PPG data regardless of his weight fluctuating between 166 lbs and 196 lbs. These two diagrams indicate that PPG is not influenced by weight. Also shown in Figure 4, the correlation coefficient between PPG and FPG is a mere 7%, which means they are essentially unrelated.
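The “constant band” check can be sketched as follows: compute the ±20% band around the average and count the fraction of readings inside it. The function name and the sample readings are hypothetical; only the 128.8 mg/dL average and the ±20% tolerance come from the text.

```python
def band_coverage(values, center: float, tolerance: float = 0.20):
    """Return (low, high, fraction of values inside [low, high])."""
    if not values:
        raise ValueError("values must not be empty")
    low = center * (1.0 - tolerance)
    high = center * (1.0 + tolerance)
    inside = sum(1 for v in values if low <= v <= high)
    return low, high, inside / len(values)


# Hypothetical PPG readings in mg/dL, not the author's data.
sample_ppg = [100.0, 110.0, 128.8, 150.0, 160.0]
low, high, fraction = band_coverage(sample_ppg, center=128.8)
```

For a center of 128.8 mg/dL the band edges are 103.04 and 154.56 mg/dL, matching the 103 and 155 mg/dL limits quoted above after rounding.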
Figure 4: Low correlation existed between PPG vs. FPG and PPG vs. Weight
As a result, neither FPG nor weight has a meaningful relationship with or influence on PPG. On the other hand, as described in other papers written by the same author, weight is the dominant factor in FPG formation. Weight is directly proportional to the total quantity of food consumed, while PPG is directly related to food quality, specifically the amount of carbs and sugar taken in. Of course, a person who eats a large quantity of food will likely take in more carbs and sugar. However, a knowledgeable and well-disciplined diabetic patient can control both the quantity and the quality of food, and therefore can control both FPG and PPG. It should be noted that the above conclusion should be re-verified for light-weight and obese patients. Strict weight reduction will be a very effective way for an obese patient to push his/her glucose (both FPG and PPG) values downward. For overweight patients, however, knowledge of the major influential factors of glucose will be extremely beneficial for their diabetes control.
The quantitative results from the developed PPG prediction model reflect its accuracy and applicability for type 2 diabetes control via guided lifestyle management. The use of wave theory and signal processing techniques has also proven quite effective for the prediction and control of PPG. As shown in Figure 5, Health Data Comparison Between 2010 and 2017, the author’s health condition has improved significantly due to the control of his glucose, especially since PPG contributes about 70% to 80% of hemoglobin A1C formation.
Figure 5: Health Data Comparison between 2010 and 2017
The author received an honorary PhD in mathematics and majored in engineering at MIT. He attended different universities over 17 years and studied 7 academic disciplines. He has spent 20,000 hours in T2D research. First, he studied 6 metabolic diseases and food nutrition during 2010-2013, then conducted his own diabetes research during 2014-2018. His approach is a “quantitative medicine” based on mathematics, physics, optical and electronics physics, engineering modeling, signal processing, computer science, big data analytics, statistics, machine learning, and AI. He calls it “math-physical medicine”. His main focus is on preventive medicine using prediction tools. He believes that the better the prediction, the more control one has.
- Hsu, Gerald C. (2018). Using Math-Physical Medicine to Control T2D via Metabolism Monitoring and Glucose Predictions. Journal of Endocrinology and Diabetes, 1(1), 1-6. Retrieved from http://www.kosmospublishers.com/wp-content/uploads/2018/06/JEAD-101-1.pdf
- Hsu, Gerald C. (2018, June). Using Math-Physical Medicine to Analyze Metabolism and Improve Health Conditions. Video presented at the meeting of the 3rd International Conference on Endocrinology and Metabolic Syndrome 2018, Amsterdam, Netherlands.
- Hsu, Gerald C. (2018). Using Math-Physical Medicine and Artificial Intelligence Technology to Manage Lifestyle and Control Metabolic Conditions of T2D. International Journal of Diabetes & Its Complications, 2(3), 1-7. Retrieved from http://cmepub.com/pdfs/using-mathphysical-medicine-and-artificial-intelligence-technology-to-manage-lifestyle-and-control-metabolic-conditions-of-t2d-412.pdf
The author created this “math-physical medicine” approach by himself in order to save his own life. Although he has read many medical books, journals, articles, and papers, he did not specifically utilize any data or methodology from other medical work. All of his research is his original work, based on data he collected from his own body and using his own computer software developed during the past 8 years. Therefore, no major problems were associated with data interference or data contamination, since he has been dealing with a homogeneous genetic condition and lifestyle environment. He could dig into one single variable very deeply to extract valuable information. In addition, his knowledge, information, techniques, and methodology in mathematics, physics, engineering, and computer science came from his lifelong learning in schools and industry and should not be listed as medical references. This is the reason his references contain only his own published papers.
Limitation of Research
This article is based on data of metabolic conditions and lifestyle details collected from one T2D patient (himself). It does not cover genetic conditions and lifestyle details of other diabetes patients. However, the author’s research approach is based on his solid inter-disciplinary academic background and successful industrial experiences. His academic background and working experience have prepared him to conduct his diabetes research with the following thorough process and carefully chosen steps:
- Observing and identifying a system’s basic characteristics as a physicist;
- Developing related but rigorous mathematical equations as a mathematician;
- Applying suitable engineering models and useful statistical models to address the real-world challenges, e.g. data variance, as an engineer;
- Using modern computer science tools and sophisticated AI techniques to aid in problem solving.
Nevertheless, his conclusions and findings should be re-verified and applied with caution to other patients who have different metabolic conditions or lifestyles.
During the past 8 years of self-study and research, the author has never hired any research assistant or research associate to help with his research work. He applied his own invention of a “Software Robot” created during 2001-2009, his AI knowledge, and his previous programming experience of around 1 million lines of code to produce the architecture, the system structure, and part of the specially designed code of this customized computer software. He used this software to collect and analyze his big data, conduct his medical research, and control his diabetes.
This project was 100% self-funded by using his own money that was earned from a successful high-tech venture in Silicon Valley. He did not receive any financial assistance or grants from any public or private institution, agency, or organization. Therefore, there are no concerns regarding any conflict of interest.
First and foremost, the author wishes to express his sincere appreciation to a very important person in his life, Professor Norman Jones at MIT and University of Liverpool. Not only did he give him the opportunity to study for his PhD at MIT, but he also trained him extensively on how to solve difficult problems and conduct any basic scientific research with a big vision, pure heart, integrity, and dedication.
The author would also like to thank Professor James Andrews at the University of Iowa, who helped and supported him tremendously when he first came to the United States. Professor Andrews’ encouragement helped him build a solid foundation in engineering and computer science. The author is forever grateful to this early mentor, who has a kind heart and guided him through the struggles of his undergraduate and master’s degree work.
Last but not least, he would like to extend his appreciation to two medical doctors associated with Stanford University and its medical clinics at different periods. Both Dr. Jamie Nuwer and Dr. Jeffrey Guardino encouraged him in his continuing diabetes research and paper publishing.