2016: The Year of the Zettabyte

My last post on Big Data was way back on February 3, 2013! This weekend I hope to continue the Big Data series and publish the next post by early next week.

However, this evening I stumbled upon a new infographic related to that February 3, 2013 post. In that post I wrote about how the volume of data is increasing exponentially every year, and I used two infographics, courtesy of the online storage site Mozy and of Cisco, to help you visualise what petabytes of data mean and how they will expand into zettabytes sometime in the future.

Well, according to Cisco’s Visual Networking Index, by the end of 2016 the world will cross the zettabyte threshold, driven largely by video streaming, phone lines or video calling, and mobile streaming, all accelerated by extremely fast Internet speeds and data transfers.

The following infographic visualises how big a zettabyte is.

INFOGRAPHIC - 2016 - The Year Of The Zettabyte V6

References:

  1. XO Communications, Are you Ready for 2016: The Year of the Zettabyte, viewed 28 March 2013, <http://www.xo.com/services/Pages/2016-The-Year-of-the-Zettabyte.aspx>

Electronic vs Paper medical records – tracking down John Doe’s medical records

Many Health Information Management (HIM) / Medical Records (MR) practitioners worldwide are still stuck with the conventional paper-based medical record. The infographic in this post (to view a larger image, click on the image below to open it in a new tab, then click on the image again in that tab) shows a typical scenario of “missing” medical records and of offsite storage, which continues to pose many problems, from logistics to damaged medical records.

Electronic medical records seem to drive greater efficiency in the storage of medical information, and they seem to me perhaps the best possible path towards better medical records management. HIM / MR practitioners working in such an environment will know their impact.

ICD 10 & ICD 11 Development – How, What, Why & When

I have enrolled as a participant in the Beta phase of the International Classification of Diseases, 11th Revision (ICD 11). To participate proactively, I will have to make comments, make proposals and propose definitions of diseases in a structured way. I will also be given a chance to participate in Field Trials, and perhaps to assist in translating ICD into other languages. This is not going to be easy, and one definitely needs knowledge of the ICD. Having worked with ICD 10, I will draw on that experience to try to contribute to the Beta phase.

So here is the first in a series of posts I shall write as I explore what is going on in the development of ICD 11.

Below is an infographic I created to begin this first post. The infographic (to view a larger image, click on the image below to open it in a new tab, then click on the image again in that tab) summarises facts I have found in the reference list below. These facts are by no means exhaustive.
References:
Can, Ç 2007, Production of ICD-11:The overall revision process, viewed 20 December 2012, < http://www.who.int/classifications/icd/ICDRevision.pdf >

James, H, ICD-11 in eleven points An update, Research Centre for Injury Studies • Flinders University, Adelaide, viewed 23 December 2012, < http://dxrevisionwatch.files.wordpress.com/2012/07/harrisonslidesamdigumd2011.pdf >

International Statistical Classification of Diseases and Related Health Problems, Volume 2 Instruction manual 2011, 2010 edn, World Health Organization, Geneva, Switzerland

World Health Organization, 2012, Classifications, viewed 18 December 2012, < http://www.who.int/classifications/icd/revision/en/ >

Big Data – Big Data Basics

Big Data 3Vs cardboard-box-icon

This post continues from the introductory post, Big Data – Introduction, and looks at the “3Vs” that define Big Data. As I researched the subject, three terms stood out: Volume, Velocity and Variety. They lead me to the widely accepted definition of Big Data from Doug Laney, an analyst at Gartner (the world’s leading information technology research and advisory company), who has characterised Big Data as “data that’s an order of magnitude greater than data you’re accustomed to.”

Accordingly, this “3Vs” model describes Big Data along three dimensions, with data increasing in all three: volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources).

The first dimension/characteristic is Volume. Edd Dumbill, program chair for the O’Reilly Strata Conference, describes Big Data as “data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.” Strata bills itself as the leading event on the nuts and bolts of building a data-driven business: it covers the latest skills, tools and technologies you need to make data work, and it brings together practitioners, researchers, IT leaders and entrepreneurs to discuss big data, Hadoop, analytics, visualisation and data markets.

To give you an idea of how the volume of data is increasing exponentially every year, customer transactions at Walmart are reported to generate more than 2.5 petabytes of data every hour. Perhaps these infographics, courtesy of the online storage site Mozy and of Cisco, will help you visualise what petabytes of data mean and how they will expand into zettabytes sometime in the future.
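To put the petabyte and zettabyte scales side by side, here is a small back-of-the-envelope sketch in Python. It assumes decimal (SI) units, where each step up is a factor of 1,000, and it treats the reported 2.5 petabytes per hour as a constant rate purely for illustration; the variable names are my own.

```python
# Decimal (SI) byte units: each named step up is a factor of 1,000.
PETABYTE = 10**15
ZETTABYTE = 10**21

# Figure quoted in this post: more than 2.5 petabytes per hour.
# Treated here as a constant rate, which is an assumption.
bytes_per_hour = 2.5 * PETABYTE

petabytes_per_zettabyte = ZETTABYTE // PETABYTE
hours_to_zettabyte = ZETTABYTE / bytes_per_hour
years_to_zettabyte = hours_to_zettabyte / (24 * 365)

print(f"1 zettabyte = {petabytes_per_zettabyte:,} petabytes")
print(f"Hours to reach 1 zettabyte: {hours_to_zettabyte:,.0f}")
print(f"Years to reach 1 zettabyte: {years_to_zettabyte:.1f}")
```

In other words, a zettabyte is a million petabytes, so even at 2.5 petabytes every hour a single source would need decades to produce one.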

Visualizing The Petabyte Age

Infographic credit : http://mozy.com/blog/misc/how-much-is-a-petabyte/

The Internet in 2015

Infographic credit : http://blogs.cisco.com/news/the-dawn-of-the-zettabyte-era-infographic/

Velocity, the second dimension/characteristic, describes the frequency at which data is generated, captured and shared by every imaginable device, all of which produce torrents of data.

I am sure you have heard of a batch process, which takes a chunk of data, submits a job to the server and waits for delivery of the result. A batch process works when the incoming data rate is slower than the batch processing rate and the result is still useful despite the delay. For many new applications and sources of data, batch processing is just not possible anymore, since the speed of data creation is even more important than the volume. The data is now real-time or nearly real-time information streaming into the server continuously.
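The difference between the two styles can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names and the sample readings are made up, and a running average stands in for whatever result a real system would compute.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: wait for the whole chunk, then compute one result."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming style: update the result as each reading arrives."""
    count = 0
    mean = 0.0
    for value in readings:
        count += 1
        mean += (value - mean) / count  # incremental mean update
        yield mean


sample = [10.0, 20.0, 30.0, 40.0]  # made-up readings for illustration

batch_result = batch_average(sample)
stream_results = list(streaming_average(sample))

print("Batch result (available only at the end):", batch_result)
print("Streaming results (one per reading):", stream_results)
```

The batch function needs the full list before it can answer, while the streaming function produces an up-to-date answer after every reading and never has to hold the whole data set.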

The data available in the world today comes from everywhere. Variety, the third dimension/characteristic, signifies this proliferation of data types, which no longer fit into the neat, easy-to-consume structures of traditional transactional data. All of this data exists as a by-product of ordinary operations. Some is generated by humans: posts to social media sites, digital pictures and videos, purchase transaction records, and GPS signals from cell phones. Some is “sensor” data generated by computers, network devices and embedded chips, such as those used to gather climate information, in everything from refrigerators and airplanes to bodily implants, and more.

The International Business Machines Corporation (IBM) adds Veracity as the fourth dimension of Big Data. Veracity concerns doubt: how much confidence can be placed in the quality (precision and accuracy) of data when it comes from so many and such varied information sources.

I guess this is enough for a brief look at the basics of Big Data.

References:
About 2012, O’Reilly Strata Conference, viewed 13 December 2012, < http://strataconf.com/strata2012/public/content/about >

McAfee, A & Brynjolfsson, E 2012, Big Data: The Management Revolution, Harvard Business Review October 2012, Boston, MA, USA

Feinleib, D 2012, The 3 I’s Of Big Data, Forbes, viewed 13 December 2012,
< http://www.forbes.com/sites/davefeinleib/2012/07/09/the-3-is-of-big-data/ >

Diya, S 2012, The 3Vs that define Big Data, Data Science Central, viewed 13 December 2012, < http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data >

Fernandes, L, O’Connor, M & Weaver, V 2012, Big Data, Bigger Outcomes, American Health Information Management Association, viewed 18 November 2012,
< http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_049741.hcsp?dDocName=bok1_049741 >

Stefan, S 2012, The 3 V of BIG Data, Agile Commerce, viewed 13 December 2012,
< http://multichannel-retailing.com/2012/05/the-3-v-of-big-data/ >

What is big data? 2012, International Business Machines Corporation (IBM), viewed 18 November 2012, < http://www-01.ibm.com/software/data/bigdata/ >

Big Data – Introduction

There is a lot of buzz and emerging hype about Big Data, and I would like to begin a series of posts on everything, well, almost everything, there is to know about Big Data that I can capture, store, search, share, analyse and visualise. I will by no means cover everything there is to know in a running series of posts on this blog, as Big Data is so huge that I could start an entire blog devoted to just this.

As this blog is about healthcare data and information issues, the issue of Big Data in healthcare is that there is a tremendous amount of data and information about the patient. I would like to think it is organised, but the real issue is that it isn’t organised as well as it should be, and that all of it is a mixture of structured and unstructured data. Here I agree with Joe Petro, senior vice president of healthcare research and development at Nuance Communications, who sums up the current state of big data. Petro believes that there is a tremendous amount of information inside an institution, and that this is a big data problem: you are trying to figure out what is going on and how to report on something, yet you are dying of thirst in a sea of information, and the issue is how to tap into that sea to make sense of what is going on.

Big Data is everywhere, not just in healthcare but in many other sectors of the global economy as well.

WHERE HAS BIG DATA IN HEALTHCARE COME FROM?

It comes from the separation of data among hospital systems: clinical components, laboratories and radiology are all separate repositories for information. Each is used to provide clinical care, scheduling information or operational information. The main issue is leveraging all of this data together, and there is often a problem when we want systems to talk to each other. An organisation can also end up with redundant information because of a legacy system, one that it continues to use, sometimes well past its vendor-supported lifetime, which results in support and maintenance challenges. The system may still meet its users’ needs, even though newer technology or more efficient methods of performing a task are now available.

WHAT IS BIG DATA, IN A NUTSHELL?

Big Data thus refers to sets of data so large, awkward and complex that traditional database management tools struggle to handle them. The difficulties include capture, storage, search, sharing, analytics and visualisation.

IBM (IBM 2012) claims that “every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few.” IBM adds that “This data is big data.”
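A figure like 2.5 quintillion bytes is hard to picture, so here is a small Python sketch that converts it into named units. It assumes decimal (SI) units, where each unit is 1,000 times the previous one, and it simply multiplies IBM's daily figure by 365 to get a yearly one; the `humanize` helper is my own and not part of any library.

```python
# Decimal (SI) unit names; each is 1,000 times the previous one.
SI_UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
            "petabytes", "exabytes", "zettabytes"]


def humanize(num_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps it below 1,000."""
    value = float(num_bytes)
    index = 0
    while value >= 1000 and index < len(SI_UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {SI_UNITS[index]}"


daily_bytes = 2.5 * 10**18       # 2.5 quintillion bytes, as quoted from IBM
yearly_bytes = daily_bytes * 365  # assumes the same rate every day

print("Created per day: ", humanize(daily_bytes))
print("Created per year:", humanize(yearly_bytes))
```

So 2.5 quintillion bytes a day is 2.5 exabytes a day, and if that rate held steady a full year would come to roughly 0.9 zettabytes.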

Big data spans three dimensions, sometimes referred to as the 3 “Vs”: Volume, Velocity, and Variety. But IBM says Big Data spans four dimensions: Volume, Velocity, Variety, and Veracity.

Image credit : http://www.asigra.com

I shall leave this post for now with this infographic (click on the image above to open it in a new tab, then zoom in for a larger or closer view) and continue with the “Vs” and Big Data Basics in the next post on Big Data.

References:
Fernandes, L, O’Connor, M & Weaver, V 2012, Big Data, Bigger Outcomes, American Health Information Management Association, viewed 18 November 2012,
< http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_049741.hcsp?dDocName=bok1_049741 >

McNickle, M 2012, 5 basics of big data, Healthcare IT News, viewed 18 November 2012,
< http://www.healthcareitnews.com/news/5-basics-big-data >

What is big data? 2012, International Business Machines Corporation (IBM), viewed 18 November 2012, < http://www-01.ibm.com/software/data/bigdata/ >