Data are measurements or observations collected as a source of information. They can be numbers, words, measurements, observations, or simply descriptions of things. Examples: 36 degrees, cold, number of hospital beds, height, weight, age, level of severity of a disease.
A variable is a characteristic or attribute that describes a person, place, thing, or idea and can assume different values. The value of a variable can "vary" from one entity to another.
Example: the temperature of a room, or a person's hair color, which could have the value "blond" for one person and "brunette" for another.
Based on the level of measurement, we can distinguish between two kinds of variables:
- Quantitative Variable
- A quantitative variable is one in which the variates differ in magnitude, e.g. income, age, etc.
- Qualitative/Categorical Variable
- A qualitative variable is one in which the variates differ in kind rather than in magnitude, e.g. marital status, gender, nationality, etc.
Qualitative Data
Overview:
- Deals with descriptions.
- The variates differ in kind rather than magnitude.
- Data can be observed but not measured.
- Colors, textures, smells, tastes, appearance, type, etc.
- Qualitative → Quality
Quantitative Data
Overview:
- Deals with numbers.
- The variates differ in magnitude.
- Data can be measured.
- Length, height, area, volume, weight, speed, time, temperature, humidity, sound levels, cost, members, ages, etc.
- Quantitative → Quantity
Example: Latte
Qualitative data:
- robust aroma
- frothy appearance
- strong taste
- burgundy cup
Example: Latte
Quantitative data:
- 12 ounces of latte
- serving temperature 150 °F
- serving cup 7 inches in height
- cost $4.95
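The latte example can be sketched in Python by splitting one record into its quantitative and qualitative fields. The `latte` dictionary, its field names, and the `split_by_kind` helper are hypothetical names chosen for illustration. Treating every numeric value as quantitative is a simplifying assumption: numeric codes such as postal codes are still qualitative.

```python
from numbers import Number

# Hypothetical record for the latte example; field names are illustrative.
latte = {
    "aroma": "robust",
    "appearance": "frothy",
    "taste": "strong",
    "cup_color": "burgundy",
    "volume_oz": 12,
    "serving_temp_f": 150,
    "cup_height_in": 7,
    "cost_usd": 4.95,
}


def split_by_kind(record):
    """Split a record into quantitative (numeric) and qualitative (other) fields."""
    quantitative, qualitative = {}, {}
    for name, value in record.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, Number) and not isinstance(value, bool):
            quantitative[name] = value
        else:
            qualitative[name] = value
    return quantitative, qualitative


quantitative, qualitative = split_by_kind(latte)
print("Quantitative:", sorted(quantitative))
print("Qualitative:", sorted(qualitative))
```

Quantitative fields support arithmetic (for example, cost per ounce), while qualitative fields can only be compared or counted.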
Based on the statistical model, there are two kinds of variables:
- Response Variable
- The outcome of a study. A variable you would be interested in predicting or forecasting. Often called a dependent variable, because its value depends on the explanatory (predictor) variable.
- Explanatory Variable
- Any variable that explains the response variable. Often called an independent variable or predictor variable.
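As a rough illustration of the two roles, the sketch below fits a straight line by ordinary least squares, using one explanatory variable to predict one response variable. The data are made up for illustration, and `fit_line`, `hours`, and `score` are hypothetical names; least squares is only one of many possible statistical models.

```python
from statistics import mean

# Made-up illustrative data (not from any real study).
hours = [1, 2, 3, 4, 5]        # explanatory (independent) variable
score = [52, 54, 61, 63, 70]   # response (dependent) variable


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(hours, score)

# Predict the response for a new value of the explanatory variable.
predicted = intercept + slope * 6
print(slope, intercept, predicted)
```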
Based on the number of variables in a study, we have the following types of data:
Univariate Data
Involves a single variable. The major purpose of univariate analysis is to describe.
· Central tendency - mean, mode, median
· Dispersion - range, variance, max, min, quartiles, standard deviation
· Frequency distribution
· Bar graph, histogram, pie chart, line graph, box-and-whiskers plot
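The univariate summaries listed above can be computed with Python's standard `statistics` module. This is a minimal sketch on made-up data; the `ages` list and the `summary` dictionary are illustrative names, and the variance shown is the sample variance.

```python
import statistics
from collections import Counter

# Made-up illustrative data: ages of seven people.
ages = [23, 25, 25, 29, 31, 34, 40]

summary = {
    # Central tendency
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    # Dispersion
    "min": min(ages),
    "max": max(ages),
    "range": max(ages) - min(ages),
    "variance": statistics.variance(ages),      # sample variance
    "stdev": statistics.stdev(ages),            # sample standard deviation
    "quartiles": statistics.quantiles(ages, n=4),
}

# Frequency distribution: how often each value occurs.
frequency = Counter(ages)

for name, value in summary.items():
    print(name, value)
print(dict(frequency))
```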
Bivariate Data
Involves two variables and deals with causes and relationships. The major purpose of bivariate analysis is to explain.
· Analysis of two variables simultaneously
· Correlation, comparisons, causes, relationships, explanations
· Tables where one variable is contingent on the values of the other variable
· Independent and dependent variables
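Two of the bivariate techniques above can be sketched in a few lines: a contingency table for two qualitative variables and a Pearson correlation for two quantitative variables. All data below are made up for illustration, and `pearson`, `records`, `heights`, and `weights` are hypothetical names.

```python
from collections import Counter
from math import sqrt

# Contingency table: count each (gender, marital status) combination.
records = [
    ("F", "single"),
    ("M", "married"),
    ("F", "married"),
    ("M", "married"),
    ("F", "single"),
]
table = Counter(records)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up heights (cm) and weights (kg) for five people.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 77, 90]
r = pearson(heights, weights)

print(dict(table))
print(round(r, 3))
```

A correlation close to 1 indicates a strong positive linear relationship; on its own it does not establish that one variable causes the other.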
Data Types
A data type, or simply type, is a classification that identifies one of various kinds of data. It determines the possible values of that type, the operations that can be performed on those values, the meaning of the data, and the way values of that type can be stored. Common examples include:
- Integers
- Booleans
- Characters
- Floating-point numbers
- Alphanumeric strings
Every individual data value has a data type that tells us what sort of value it is. The most common data types are NUMBERS, which R calls numeric values, and TEXT, which R calls character values. R also has LOGICAL values.
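The idea that a type determines both the possible values and the permitted operations can be illustrated in Python (chosen here only as a neutral example language; the names below are illustrative). Python has no separate character type, so a single character is represented as a string of length one.

```python
# One sample value for each kind of data type listed above.
samples = {
    "integer": 42,
    "boolean": True,
    "character": "a",            # Python has no char type; this is a 1-character str
    "floating-point number": 3.14,
    "alphanumeric string": "abc123",
}
python_types = {label: type(value).__name__ for label, value in samples.items()}

# The same operator means different things for different types.
numeric_sum = 2 + 3          # integer addition
text_concat = "2" + "3"      # string concatenation

# Mixing incompatible types is rejected rather than guessed at.
try:
    mixed = 2 + "3"
except TypeError:
    mixed = None

print(python_types)
print(numeric_sum, text_concat, mixed)
```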
SAS Data Types
Internally, SAS supports two data types for storing data:
- CHAR: fixed-length character data, 32,767-character maximum
- NUM: double-precision floating-point number
Note: If the data field is longer than 254 characters, the SAS ODBC Driver processes it as the ODBC data type SQL_VARCHAR.
By using SAS format information, the SAS ODBC Driver is able to represent other ODBC data types, both when responding to queries and in CREATE TABLE statements.
CREATE TABLE: ODBC Data Type | SAS Data Type
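Because SAS stores every NUM value as a double-precision floating-point number, integers are represented exactly only up to a limit. The sketch below uses Python's `float` as a stand-in, on the assumption that it is an IEEE 754 double (true on typical platforms); it does not call SAS, and behavior on other hardware may differ.

```python
import struct
import sys

# Size of a double in bytes, using the standard (platform-independent) layout.
size_bytes = struct.calcsize("<d")

# Number of bits of precision in the significand of a double.
mantissa_bits = sys.float_info.mant_dig

# Integers beyond 2**mantissa_bits can no longer all be told apart.
largest_exact = 2 ** mantissa_bits
collides = float(largest_exact) == float(largest_exact + 1)

print(size_bytes, mantissa_bits, largest_exact, collides)
```

In practice this means very long numeric identifiers are safer stored as character data than as numbers.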
Every field in a JMP file has a name and a data type. The data type indicates how much physical storage to set aside for the field and the format in which the data is stored.
Data Types in STATA
Each element of data is said to be of either string or numeric type. Associated with each data type is a storage type.
STRING: Strings are stored as str1, str2, str3, and so on.
NUMBER: Numbers are stored as byte, int, long, float, or double, with the default being float. byte, int, and long are said to be of integer type in that they can hold only integers.
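The practical effect of a default float storage type can be sketched in Python. This assumes, beyond what is stated above, that the float storage type is a 4-byte IEEE 754 single-precision value and that double is the 8-byte equivalent; `as_float32` is a hypothetical helper and nothing here calls Stata.

```python
import struct


def as_float32(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 cannot be stored exactly; single and double precision round it differently.
stored = as_float32(0.1)
same_as_double = stored == 0.1

# Values that are exact in binary survive the round trip unchanged.
half_is_exact = as_float32(0.5) == 0.5

# Large integers lose precision in single precision (24 significant bits).
big = as_float32(16777217.0)   # 2**24 + 1

print(stored, same_as_double, half_is_exact, big)
```

This is why integer-valued data are better kept in an integer storage type, and why comparing a float-stored value against a decimal literal can fail unexpectedly.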
Data Structures
A data structure is an actual implementation of a particular abstract data type. Different languages approach the construction of data structures differently.
Data structure refers to the way data is organized and manipulated, with the aim of making data access more efficient. When dealing with data structures, we focus not on a single piece of data but on different sets of data and how they relate to one another in an organized manner.
Examples: arrays, lists, iterators, stacks and queues, binary trees, min and max heaps, graphs, hash tables, sets, and the tradeoffs among them.
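A few of these structures can be sketched with Python's built-in types to show how the choice of structure determines which access pattern is efficient. This is an illustrative sketch, not a full implementation of each abstract data type; all variable names are hypothetical.

```python
from collections import deque
import heapq

# Stack: last in, first out.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()

# Queue: first in, first out. deque gives fast removal from the left end.
queue = deque()
queue.append("a")
queue.append("b")
first = queue.popleft()

# Min heap: the smallest element is always available first.
heap = [5, 1, 4]
heapq.heapify(heap)
smallest = heapq.heappop(heap)

# Hash table: look up a value by key.
ages = {"alice": 30, "bob": 25}
bob_age = ages["bob"]

# Set: unordered collection of unique values.
members = {1, 2, 2, 3}

print(top, first, smallest, bob_age, sorted(members))
```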
- Vectors
- Factors
- Matrices
A two-dimensional collection of values that all have the same type. The values are arranged in rows and columns.
There is also an array data structure that extends this idea to more than two dimensions.
- Data frames
A collection of vectors that all have the same length. This is like a matrix, except that each column can contain a different data type.
A data frame can be used to represent an entire data set.
- Lists
-
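The matrix and data frame descriptions above can be approximated in plain Python. This is a sketch under stated assumptions: `make_data_frame` and `make_factor` are hypothetical helpers, not part of any library, and the factor representation (a fixed set of levels plus integer codes) is one common way to model categorical values rather than a definition taken from the text above.

```python
def make_data_frame(columns):
    """Return a dict of columns after checking that all have the same length."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    return dict(columns)


def make_factor(values, levels):
    """Encode values against a fixed list of allowed levels (assumed representation)."""
    levels = list(levels)
    unknown = set(values) - set(levels)
    if unknown:
        raise ValueError(f"values outside the allowed levels: {sorted(unknown)}")
    return {"levels": levels, "codes": [levels.index(v) for v in values]}


# Matrix: rows and columns of values that all have the same type.
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]

# Data frame: equal-length columns, each with its own type.
frame = make_data_frame({
    "name": ["Ann", "Ben", "Cy"],
    "age": [31, 27, 45],
    "married": [True, False, True],
})

status = make_factor(["single", "married", "single"], levels=["single", "married"])

# One row of the data set is the same position taken from every column.
second_row = {column: values[1] for column, values in frame.items()}
print(second_row, status["codes"], matrix[1][2])
```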