Healthcare Big Data – Part 2a

In the post Healthcare Big Data – Part 2, I wrote that no matter the size of Healthcare Big Data, the goal is ultimately to improve patient care and reduce costs. This matters because the healthcare industry worldwide is, in general, afflicted with poorly coordinated care, fraud and abuse, and administrative and clinical inefficiency.

In this post I would like to share with you the infographic below, which I think rightly supplements what I wrote in the post mentioned above.

This infographic visualises the worldwide trend to digitise patient information, moving from paper-based medical records to Electronic Medical Records. This trend continues to gather momentum and produces ever-growing volumes of Big Data, an estimated 50 petabytes of data in the healthcare realm. This influx of Big Data will create more jobs to handle all these data, especially new jobs that demand new talent in analytics.

This infographic also visualises the bulk of the internal source of Healthcare Big Data, which originates from medical providers and ancillary services providers during the course of providing their services. More Big Data is accumulated when this internal data source is in turn used for insurance claims and payments, to a greater extent in advanced economies and to a lesser extent in less advanced economies. The technology vendors provide the technology interface for the internal source of Healthcare Big Data.

Then there is the external storage of Healthcare Big Data, both public and private. Public health agencies also generate Healthcare Big Data mandated by legislation and regulations, e.g. immunisation and cancer data, and store them in data repositories. Third-party organisations also generate Healthcare Big Data when they coordinate between healthcare providers. Private data are also kept in remote, web-based repositories when some consumers maintain personal (private) health records online.

According to this infographic, patient care is improved when streaming data is used to decrease patient mortality as these data move through healthcare. However, the bigger challenge is to harness the unstructured data, which make up 80% of all patient information in Healthcare Big Data.

When it comes to healthcare, Big Data is a Big Deal

Infographic credit: healthcareitconnect.com/

After I have blogged about Big Data solutions in a future post, I shall discuss the ways in which Big Data will transform healthcare in the near future through cost savings, quality of care, and care coordination.

Healthcare Big Data – Part 2

In this second instalment of Healthcare Big Data, let’s look at the multiple sources of data that are responsible for Healthcare Big Data.

The internal data found in existing paper-based medical records is one large source of Healthcare Big Data.

With more and more hospitals around the world creating digital representations of existing data in paper-based medical records, and acquiring everything that is new in the form of Electronic Medical Records, this internal data source keeps growing with no end in sight.

Then there is also Big Data from other sources: external, private, and public.

One source of external Healthcare Big Data is the discovery process, both oral and written, initiated by the legal profession outside the healthcare industry when individual doctors, hospitals, and medical practice groups become defendants in malpractice lawsuits. This process adds terabytes or even petabytes of information.

No matter the size of Healthcare Big Data, the goal in healthcare is to improve patient care and reduce costs.

Thus, to improve patient care and reduce costs through Healthcare Big Data, one of the biggest challenges for most healthcare organisations is to mine the data, that is, to dig for something of value in these multiple sources of data. Healthcare organisations must find (i.e. locate) the appropriate data, identify useful data (i.e. determine whether a data set is appropriate for use), aggregate all of the Big Data from the multiple sources, and push it through an analytics platform as part of their analytics processes.

Since I am running a blog for the general benefit of Health Information Management (HIM) / Medical Records (MR) practitioners, I shall not be diving deeper into Big Data sources, to avoid driving readers into the IT realm, nor shall I be writing on the business analytics (BA) and business intelligence (BI) processes that determine how large-scale data sets can be used. I must say that all the posts on Big Data I have published on this website-blog, including this one, are meant to give HIM / MR practitioners a rudimentary understanding of Big Data.

Now that HIM / MR practitioner readers know that Big Data is out there, note that Frank (2013) states that “analytics is part science, part investigative work, and part assumption.” The idea is to capture as much as possible of the data the healthcare organisation deals with. Data are located, included and gathered from as many data sources as possible, so that there are more data to work with, and all of these data are brought into an analytics platform.

While locating, including and gathering data from as many sources as possible, healthcare organisations will find a vast wealth of external public information. This external data makes up the public portion of Big Data. It ranges from customer sentiments from research companies and social networking sites (e.g. Twitter, Facebook), to geopolitical issues, weather information and traffic pattern information, to census data from government entities, and a multitude of other sources.

In the next instalment, I shall gather more information on how the multitude of sources of Healthcare Big Data must be integrated and managed to set priorities so that Big Data solutions could analyse and get the results into the right hands to improve patient care and reduce costs.

Resources:

  1. Frank JO 2013, Big Data Analytics: Turning Big Data into Big Money, Wiley and SAS Business Series, John Wiley & Sons, Inc, New Jersey, USA

Healthcare Big Data – Part 1

To continue from the introductory posts Big Data – Introduction and Big Data – Big Data Basics, this first part will introduce the subject of big data in healthcare and end there.

As a Health Information Management (HIM) / Medical Records (MR) practitioner, you would surely be aware from your practice of managing medical records that an individual patient’s clinical signs and symptoms, medical and family history, and data from laboratory and imaging evaluation found in his or her medical record are used by the attending doctor to diagnose and then treat the patient’s illnesses. This traditional clinical diagnosis and management approach to treatment has been, and still often is, a reactive approach, i.e. the doctor starts treatment or medication after the signs and symptoms appear.

However, given the genetic variability between individuals, advances in medical genetics and human genetics ever since the Human Genome Project was completed in 2003 have given both scientists and clinicians new ways to understand health and manage disease. That is to say, these fields have been providing a more detailed understanding of the roles of genes in normal human development and physiology and in the risk for many common diseases, unlike the way diseases have been understood in the traditional reactive approach.

Two kinds of data create the vast collections of data known as Healthcare Big Data. The first is standard data: an individual patient’s clinical signs and symptoms, medical and family history, patient discharges, real-time clinical transactions, and data from laboratory and imaging evaluation found in his or her medical record. The second is data in personalised medicine (PM), where medical decisions, practices, and/or products are tailored or customised to the individual patient with the use of genetic information (genomic data), which comes from the study of the complete set of DNA within a single cell of the individual. Either kind, or a combination of the two, creates these vast collections.

Healthcare Big Data has tremendous potential to add value from analysing and mining these vast collections of data now available to hospitals in general.

But Healthcare Big Data must be managed, leveraged and integrated to help personalise care (as in PM), engage patients, reduce variability and costs, and ultimately improve quality.

In order to manage, leverage and integrate Healthcare Big Data, Big Data solutions are needed. Big Data solutions apply analytics to examine and better understand the large amounts of unstructured clinical data (images, scanned documents, and encounter or progress notes) in their native state. They integrate these with structured operational data based on historical and current trends to uncover hidden patterns, unknown correlations and other useful information, and they then help predict what might occur in the future with greater reliability. The output of Healthcare Big Data analytics is useful because such information can provide competitive advantages over rival hospital organisations and result in business benefits, for example more effective marketing and thus increased revenue.

In the next post on Healthcare Big Data, I shall be blogging about the challenges in aggregating the Healthcare Big Data from multiple sources.

References:

  1. Denise, A 2013, Leveraging big data analytics to improve healthcare delivery, ZDNet,  viewed 30 March 2013, < http://www.zdnet.com/leveraging-big-data-analytics-to-improve-healthcare-delivery-7000013072/ >
  2. Geoffrey, SG and Huntington, FW (eds.) 2010, Essentials of Genomic and Personalized Medicine, Academic Press, Elsevier Inc, San Diego, CA, USA
  3. Lorraine, F, Michele, O’C,  & Victoria, W 2012, Data, Bigger Outcomes, American Health Information Management Association, viewed 18 November 2012, < http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_049741.hcsp?dDocName=bok1_049741 >
  4. Margaret, R, 2012,  DEFINITION big data analytics, TechTarget, viewed 1 April 2013, < http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics >
  5. Neil, V 2013, Big Data Use In Healthcare Needs Governance, Education, InformationWeek, viewed 30 March 2013, < http://www.informationweek.com/healthcare/clinical-systems/big-data-use-in-healthcare-needs-governa/240151395 >

Big Data – Big Data Basics


This post continues from the introductory post Big Data – Introduction and is about the “3Vs” that define Big Data. As I researched the subject of Big Data, three terms stood out: Volume, Velocity and Variety. This leads me to explain in this post the widely accepted definition of Big Data from Doug Laney, an analyst at Gartner (the world’s leading information technology research and advisory company), who has characterised Big Data as “data that’s an order of magnitude greater than data you’re accustomed to.”

Accordingly, this “3Vs” model for describing Big Data spans three dimensions of increasing data: volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources).

The first dimension/characteristic is Volume. Edd Dumbill, program chair for the O’Reilly Strata Conference, describes Big Data as “data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.” The O’Reilly Strata Conference is a leading event that brings together practitioners, researchers, IT leaders and entrepreneurs to discuss big data, Hadoop, analytics, visualisation and data markets, and the skills, tools, and technologies needed to build a data-driven business.

To give you an idea of the volume of data, which is increasing exponentially every year, customer transactions at Walmart are reported to generate more than 2.5 petabytes of data every hour. Perhaps these infographics, courtesy of the online storage site Mozy and of Cisco, will help you visualise the meaning of petabytes of data and how it expands further into zettabytes sometime in the future.

Visualizing The Petabyte Age

Infographic credit : http://mozy.com/blog/misc/how-much-is-a-petabyte/

The Internet in 2015

Infographic credit : http://blogs.cisco.com/news/the-dawn-of-the-zettabyte-era-infographic/
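
For readers who would like to see the arithmetic behind these units, here is a minimal sketch in Python. It assumes decimal (SI) units, where each unit is 1,000 times the previous one, and it assumes purely for illustration that the reported Walmart rate of 2.5 petabytes per hour holds constant around the clock, which the source does not claim. The function names are my own and are not taken from any of the references.

```python
# Decimal (SI) byte units: each step up the list is a factor of 1,000.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` to a whole number of bytes."""
    return int(value * 1000 ** UNITS.index(unit))


def from_bytes(num_bytes, unit):
    """Express a number of bytes in `unit`."""
    return num_bytes / 1000 ** UNITS.index(unit)


# Illustrative assumption: the reported hourly rate never changes.
hourly = to_bytes(2.5, "petabytes")
daily = hourly * 24
yearly = daily * 365

print(from_bytes(daily, "petabytes"), "petabytes per day")
print(from_bytes(yearly, "exabytes"), "exabytes per year")
print(from_bytes(yearly, "zettabytes"), "zettabytes per year")
```

Under these assumptions, a single day comes to 60 petabytes, and a full year is still only a small fraction of one zettabyte, which gives a feel for how large a zettabyte is.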

Velocity, the second dimension/characteristic, describes the frequency at which data is generated, captured and shared by every imaginable device, all of which produce torrents of data.

I am sure you have heard of a batch process, which takes a chunk of data, submits a job to the server and waits for delivery of the result. A batch process works when the incoming data rate is slower than the batch processing rate and the result is still useful despite the delay. For many new applications and sources of data, the batch process is just not possible anymore, since the speed of data creation is even more important than the volume. The data is now real-time or nearly real-time information streaming into the server in a continuous fashion.
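
To make the difference between batch and streaming a little more concrete, here is a minimal sketch in Python. The heart-rate readings are made-up numbers and the function names are hypothetical. It only illustrates the idea of waiting for a whole chunk versus updating a result as each value arrives, and is not how any particular Big Data product works.

```python
def batch_average(readings):
    """Batch style: wait for the whole chunk, then compute one result."""
    return sum(readings) / len(readings)


def streaming_average(readings):
    """Streaming style: yield an updated average as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Hypothetical heart-rate readings, for illustration only.
heart_rates = [72, 75, 71, 90, 110]

batch_result = batch_average(heart_rates)
stream_results = list(streaming_average(heart_rates))

print("Batch result, available only at the end:", batch_result)
print("Streaming results, one per reading:", stream_results)
```

Both approaches arrive at the same final average, but the streaming version has an up-to-date answer after every single reading, so a rising trend can be noticed before the whole chunk has been collected.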

The available data in the world today comes from everywhere. Variety, the third dimension/characteristic, signifies the proliferation of new data types which no longer fit into the neat, easy-to-consume structures of traditional transactional data, and all of which exist as a by-product of ordinary operations. Some are generated by humans: posts to social media sites, digital pictures and videos, purchase transaction records, and GPS signals from cell phones. Others are “sensor” data generated by computers, network devices and embedded chips, such as those used to gather climate information and those found in everything from refrigerators and airplanes to bodily implants, and more.

The International Business Machines Corporation (IBM) adds Veracity as the fourth dimension of Big Data. Veracity concerns how much confidence can be placed in the quality (precision and accuracy) of the data, given the variety and number of information sources.

I guess this is enough to know briefly about the basics of Big Data.

References:
About 2012, O’Reilly Strata Conference, viewed 13 December 2012, < http://strataconf.com/strata2012/public/content/about >

Andrew, M & Erik, B 2012, Big Data: The Management Revolution, Harvard Business Review October 2012, Boston, MA, USA

Dave, F 2012, The 3 I’s Of Big Data, Forbes, viewed 13 December 2012,
< http://www.forbes.com/sites/davefeinleib/2012/07/09/the-3-is-of-big-data/ >

Diya, S 2012, The 3Vs that define Big Data, Data Science Central, viewed 13 December 2012, < http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data >

Lorraine, F, Michele, O’C,  & Victoria, W 2012, Data, Bigger Outcomes, American Health Information Management Association, viewed 18 November 2012,
< http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_049741.hcsp?dDocName=bok1_049741 >

Stefan, S 2012, The 3 V of BIG Data, Agile Commerce, viewed 13 December 2012,
< http://multichannel-retailing.com/2012/05/the-3-v-of-big-data/ >

What is big data? 2012, International Business Machines Corporation (IBM), viewed 18 November 2012, < http://www-01.ibm.com/software/data/bigdata/ >

Big Data – Introduction

There is a lot of buzz and a lot of emerging hype about Big Data, and I would like to begin a series of posts on everything, well almost everything, there is to know about Big Data that I can capture, store, search, share, analyse and visualise. By no means will I have discussed everything there is to know about Big Data in a running series of posts in this blog, as I think Big Data is so huge that I could start an entire blog devoted to just this.

As this blog is about healthcare data and information issues, the issue of Big Data in healthcare is that there’s a tremendous amount of data and information about the patient. I wish I could think it is organised, but the real issue is that it isn’t organised as well as it should be, and all that data is a mixture of structured and unstructured data. Here I agree with Joe Petro, senior vice president of healthcare research and development at Nuance Communications, who sums up the current state of big data. Petro believes that there’s a tremendous amount of information when you’re in the institution. It is a big data problem: you’re trying to figure out what’s going on and how to report on something, you’re dying of thirst in a sea of information, and the issue is how to tap into that to make sense of what’s going on.

Big Data is everywhere, not just in healthcare but in many other sectors of the global economy as well.

WHERE HAS BIG DATA IN HEALTHCARE COME FROM?

Data are separated among hospital systems: clinical components, laboratories, and radiology are all separate repositories for information. The main issue is with leveraging all of these data. Their use is to provide clinical care, scheduling information or operational information. Often there is a problem if we want systems to talk to each other. An organisation can also end up with redundant information due to a legacy system, a system we may continue to use, sometimes well past its vendor-supported lifetime, resulting in support and maintenance challenges. It may be that the system still provides for the users’ needs, even though newer technology or more efficient methods of performing a task are now available.

WHAT IS BIG DATA, IN A NUTSHELL?

Big Data thus refers to sets of data that are so large, awkward and complex that traditional database management tools struggle to capture, store, analyse and share them. Difficulties include capture, storage, search, sharing, analytics, and visualisation.

IBM (IBM 2012) claims that “every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few.” IBM adds that “This data is big data.”

Big data spans three dimensions, sometimes referred to as the 3 “Vs”: Volume, Velocity, and Variety. But IBM says Big Data spans four dimensions: Volume, Velocity, Variety, and Veracity.

Image credit: http://www.asigra.com

I shall leave this post for now with this infographic and continue with the “Vs” and Big Data Basics in the next post on Big Data.

References:
Lorraine, F, Michele, O’C,  & Victoria, W 2012, Data, Bigger Outcomes, American Health Information Management Association, viewed 18 November 2012,
< http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_049741.hcsp?dDocName=bok1_049741 >

Michelle, MN 2012, 5 basics of big data, Healthcare IT News, viewed 18 November 2012,
< http://www.healthcareitnews.com/news/5-basics-big-data >

What is big data? 2012, International Business Machines Corporation (IBM), viewed 18 November 2012, < http://www-01.ibm.com/software/data/bigdata/ >