185 Madison Ave. at 34th St., Suite 1104, New York, NY 10016 • 212-684-5151
Database Cleansing

Price: $425.00

Course Code: SF | Hours: 8

Database Cleansing Training

The use of databases has grown sharply over the years, and for good reason. Database tools like SQL, Oracle, Excel and Access make it possible to store and analyze data in ways that were out of reach only a few years ago. Companies analyze data sets to make critical decisions that affect virtually every aspect of their businesses.

If your analysis is only as good as your data, how do you make sure you have the best data?

Database Cleansing is a course that discusses how to maintain and protect the integrity of data to facilitate optimal analysis and decision making. Before analyzing large sets of data, it is important to cleanse the data of imperfections such as:

  • Duplicates
  • Typos
  • Redundant data
  • Missing data
  • Inconsistently named data
  • Corrupt data
  • Import errors
  • Data type mismatches

Database Cleansing details how to spot and fix dirty-data issues, leaving you with the clean, useful data that sound decision making depends on.
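As a rough illustration of what spotting such issues can look like in practice, the sketch below flags duplicates, missing values and data type mismatches in a handful of rows. It is not taken from the course materials: the sample rows and the helper names (normalize, is_number, find_issues) are hypothetical, and Python is used only for brevity; the same checks can be built with worksheet functions or queries.

```python
from collections import Counter

# Hypothetical sample rows: (customer name, state, order total as imported text)
rows = [
    ("Acme Corp", "NY", "100.50"),
    ("acme corp ", "ny", "100.50"),   # duplicate hidden by case and spacing
    ("Bolt Ltd", "", "75"),           # missing state
    ("Cog Inc", "NJ", "n/a"),         # data type mismatch: total is not numeric
]

def normalize(text):
    """Trim, collapse internal whitespace, and lowercase for comparison."""
    return " ".join(text.split()).lower()

def is_number(text):
    """Return True when the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def find_issues(rows):
    """Return a dict listing the row indexes affected by each kind of problem."""
    # Compare rows on normalized name and state so cosmetic differences do not hide duplicates
    keys = [(normalize(name), normalize(state)) for name, state, _ in rows]
    counts = Counter(keys)
    return {
        "duplicates": [i for i, k in enumerate(keys) if counts[k] > 1],
        "missing": [i for i, r in enumerate(rows) if any(not f.strip() for f in r)],
        "type_mismatch": [i for i, r in enumerate(rows)
                          if r[2].strip() and not is_number(r[2])],
    }

issues = find_issues(rows)
print(issues)
```

Normalizing before comparing is the key design choice here: rows that differ only in capitalization or stray spaces are treated as the same record.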

Database Cleansing also deals with integrating data from many different sources, such as websites, PDFs, graphic images and text files, into clean data sets that can be merged with more traditional databases like SQL, Oracle, Excel and Access. You will learn how to use various tools, including optical character recognition (OCR) programs, to strip useful data from these challenging sources and format it correctly for its target destination.

Finally, Database Cleansing will teach best practices in database construction and maintenance to preserve your data integrity. You will learn tips and tricks to make sure your data is complete, accurate and easy to reference.

Upon successful completion of this course, students will be able to:

  • Analyze data integrity
  • Cleanse data from common mistakes
  • Utilize formulas to correct poor data entry
  • Structure tables optimally for import
  • Identify potential bad practices
  • Break text strings into component parts
  • Extract data from multiple file types
  • Correct data errors caused by conflicting formats
  • Use tools such as optical character recognition (OCR) programs to strip useful data from challenging sources
  • Apply best practices in database construction and maintenance to preserve data integrity
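To give a sense of what breaking a text string into component parts involves, here is a minimal sketch that splits a combined "City, ST ZIP" field into three separate values. The field layout, the pattern and the function name split_city_state_zip are assumptions made for illustration, not part of the course materials.

```python
import re

# Assumed layout: "City, ST 12345" or "City, ST 12345-6789"
CITY_STATE_ZIP = re.compile(
    r"^\s*(?P<city>[^,]+?)\s*,\s*(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)

def split_city_state_zip(text):
    """Split a combined field into city, state and ZIP, or return None if it does not fit."""
    match = CITY_STATE_ZIP.match(text)
    if match is None:
        return None  # leave unparseable values for manual review instead of guessing
    return {
        "city": match.group("city"),
        "state": match.group("state").upper(),  # store state codes in one consistent case
        "zip": match.group("zip"),
    }

print(split_city_state_zip("New York, NY 10016"))
```

Returning None for values that do not match keeps bad rows visible, so they can be corrected by hand instead of being silently split in the wrong place.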


Students should have knowledge of the basics of a relational database such as SQL, Access, Excel or Oracle.

Database Cleansing Training Outline:

Lesson 1: Identifying Problems

        • Topic 1A: Poor Data Entry
        • Topic 1B: Redundant Data
        • Topic 1C: Unmatched Data
        • Topic 1D: Incorrect Data Types
        • Topic 1E: Improper Table Structures

Lesson 2: Exploring Various Tools

        • Topic 2A: Excel Worksheet Functions & Tools
        • Topic 2B: Excel Visual Basic Procedures
        • Topic 2C: Access and SQL Server Functions
        • Topic 2D: Find and Replace Tools
        • Topic 2E: Using Notepad

Lesson 3: Interactive Workshops

        • Topic 3A: Importing and Exporting Across Varying Platforms
        • Topic 3B: Extracting Tables from PDF Files
        • Topic 3C: Writing Cleanup Functions in Excel
        • Topic 3D: Using VBA to Automate Normalization of Table Structures
        • Topic 3E: Understanding and Building Referential Integrity
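As a small illustration of the referential integrity idea named in Topic 3E, the sketch below uses Python's built-in sqlite3 module to show a foreign key rejecting an order that points at a customer who does not exist. The table names, columns and the helper try_insert_order are hypothetical, and SQLite stands in for the platforms covered in the course; this is not drawn from the course materials.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           total REAL NOT NULL
       )"""
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme Corp')")

def try_insert_order(conn, customer_id, total):
    """Insert an order; return False if the database rejects it as an orphan."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            (customer_id, total),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert_order(conn, 1, 100.50))  # customer 1 exists
print(try_insert_order(conn, 99, 25.00))  # customer 99 does not exist
```

Letting the database enforce the relationship means orphaned rows are stopped at entry time, which is cheaper than finding and cleaning them afterwards.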