BIG DATA VISUALIZATION
As the volume and complexity of Big Data have grown, communicating the information mined from it has become more important. Data visualization is now conceptualized as telling a narrative, and this concept underlies the work of authors who have addressed the topic.
In their book Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking, Foster Provost and Tom Fawcett define data science as “a set of fundamental principles that guide the extraction of knowledge from data.”
Data analysis is an important part of data science, and one of the key concepts the authors define is the ability to think data-analytically. This involves “identifying appropriate data and consider[ing] appropriate methods” through which to analyze and use it.
STATISTICAL DATA ANALYSIS
Statistical data analysis is made possible because variability exists in any collection of items, measurements, or phenomena. A company’s sales, both quantities and amounts, differ from month to month, as do its expenditures.
Simple linear regression fits a line to a group of paired data points so that one number in the pair, called the independent variable (IV), can be used to predict the second, called the dependent variable (DV).
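As an illustration, the line-fitting idea can be sketched in a few lines of Python using ordinary least squares, the most common way to fit such a line. The function names (fit_line, predict) and the advertising and sales figures are hypothetical, chosen only for this example; they do not come from any source discussed here.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the IV, and of cross-products of IV and DV.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("the independent variable must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the DV for a given value of the IV."""
    return intercept + slope * x


# Hypothetical paired data: advertising spend (IV) and sales (DV).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, sales)
forecast = predict(slope, intercept, 6.0)
print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"predicted sales at ad_spend 6.0: {forecast:.2f}")
```

Note that the fit requires the IV to vary: if every value of the independent variable were identical, no slope could be estimated, which is why the sketch raises an error in that case.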
Machine learning refers to a broad range of algorithms and computational techniques in which prior information is used to “teach” the algorithm how to predict future instances of a phenomenon.