Glossary – Database

Context: Databases are data stores that are in themselves, organized mechanisms for storing, managing and efficiently retrieving raw data. Databases have other features that allow data to be compiled in the form of reports. Stored data can be collated or statistically counted so that a business can learn or make decisions based on how the business intelligence data is analyzed. Other features are related to importing or consuming more data in automated (hands free) ways.

Note: This glossary section serves to clarify any terms used in related posts on this site or in external internet literature; business jargon, engineering vocabulary or technical terms and meanings associated with the use of database data stores are covered here.
.

---     ---     ---

Database engines

Of the many database products on the market, some are open source and some support additional features like business intelligence support, scripting / programming support, here is a list of some database engines / database tools.

Access, CassandraClipper, dBaseFireBird, HBaseIBM DB2, IBM Informix, IngressInterBase, InterSystems Cache, Lotus, MySQL, MongoDBNeo4j graph dbaseOracle, Paradox, PostgreSQL, SQLite, SQL Server, Teradata, Visual Foxpro,  and there are many more …
.

---     ---     ---

Data Ingested / Data Ingestion

The act of consuming / importing / absorbing data into a companies data store database; this can be a computerised / automatic hands free process. It’s common to discuss Data Ingestion in the context of database migration or in the context of consuming large amounts of data for use in reports or data intelligence processing.

.

---     ---     ---

Data Transformation

Data is copied from a source system and translated, transformed or restructured into new data values, these replacement data values might also be formatted differently and / or stored as different data types to be compatible with the destination database system.

Transformation Scenarios: –

  • database-schema_640 Source data needs to be merged with data values from another database source before being stored into a destination system (ssis Union / Aggregate transformations, Merge / Join transformation)
  • Composite data values needs to be split into several separate column values (ssis Conditional Split or Derived Column Transformation)
  • Faulty data values need to be fixed and corrected or cleansed (ssis Data Conversion, Convert or Cast, Derived column)
  • Null or missing values need to be replaced with defined business default values.
  • Certain / targeted data values need to be filtered out and not stored into the destination system (ssis Lookup transformations, Sort transformation).

data_binary_640Transformation algorithms and automation processes convert data or information before storing the transformed values into the destination system.

Data mappings are used to identify where data comes from and where it should be directed to; a data dictionary is used to classify the target data types, the format of stored data and the rules for data transformation. The data dictionary is also used to clarify ownership and administration responsibilities for data wherever it is stored.

.

---     ---     ---

ETL (Extract-Transform-Load)

3 steps to consume data into a business data store

ETL is used in database systems where we require additional data  to be consumed or merged with company data for use in business intelligence / analysis of financials / marketing  / commodities prices and market trends / …

For example one might want to consume data from a government or European union data provider and use this data in conjunction with company data to provide insights into market trends / identifying new prospects or business opportunities / understanding market share or demographic changes over time / …

Extract: Covers the data extraction of data records from an external data source such as the ONS Office for national statistics web site / ECB European Central Bank / some other European data source / …

Transform: A set of data processing algorithms or rules to transform the data from the source into a form that is compatible with the company data store database. In addition it’s often necessary to clean data records so that empty / missing data values are transformed into default values that can be used by any computer programs or company web sites without creating additional problems.

Load: The loading of this new data into the companies data store database.  In addition it’s often necessary to ensure that duplicate data records are removed.

.

---     ---     ---

RDBMS (Relational database management system)

A data store that stores data inside data containers called data tables.  A database contains a set of tables containing data fitted into predefined categories. Each table contains one or more data categories in columns.

A table is a collection of data elements organized in terms of rows and columns. A table is also considered as convenient representation of relations. Data must be stored and presented as relations, i.e. tables that have relationships with each other, e.g., primary/foreign keys.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Creativity seeds Inspiration

%d bloggers like this: