OpenRefine : OpenRefine, previously known as Google Refine, is an open-source desktop application for data cleanup and transformation to other formats, an activity commonly known as data wrangling. It looks like a spreadsheet and can handle spreadsheet file formats such as CSV, but it operates more like a database: it lets you explore and manipulate large quantities of data all at once, with discovery capabilities beyond those of programs like Microsoft Excel.
To export your history: Go to the Undo/Redo tab to the left of the grid view. At the top of the pane, click the “Extract…” button. In the pop-up window, select the steps you would like to export, or simply click the “Select all” button. The selected steps are shown as JSON, which you can copy and save to a text file.
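Once the history has been saved as JSON, it can be inspected outside OpenRefine. The following is a minimal sketch, assuming the extracted history is a JSON array in which each step carries an "op" and a "description" field. The sample data, the column names, and the function name summarize_history are hypothetical, and a real export may carry additional fields.

```python
import json

# Hypothetical sample of an extracted history; a real export may contain
# more fields per step (for example facet or engine settings).
extracted = """
[
  {"op": "core/text-transform",
   "columnName": "name",
   "expression": "value.trim()",
   "description": "Text transform on cells in column name"},
  {"op": "core/column-rename",
   "oldColumnName": "name",
   "newColumnName": "full_name",
   "description": "Rename column name to full_name"}
]
"""


def summarize_history(raw):
    """Return (step number, operation id, description) for each step."""
    steps = json.loads(raw)
    return [
        (number, step.get("op", "?"), step.get("description", ""))
        for number, step in enumerate(steps, start=1)
    ]


for number, op, description in summarize_history(extracted):
    print(f"{number}. [{op}] {description}")
```

A summary like this can help when reviewing which steps to re-apply to another project.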
Facets are one of OpenRefine’s strongest features – that’s where the diamond logo comes from! Faceting allows you to look for patterns and trends. Facets are essentially aspects or angles of data variance in a given column.
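To make the idea concrete, a text facet can be thought of as a count of each distinct value in a column, and selecting values in several facets as a combined filter. The sketch below illustrates that idea in plain Python. It is not how OpenRefine is implemented, and the sample rows and the function names text_facet and filter_rows are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows standing in for an OpenRefine project.
rows = [
    {"city": "Paris", "year": 2019},
    {"city": "paris", "year": 2020},
    {"city": "Lyon", "year": 2019},
    {"city": "Paris", "year": 2020},
]


def text_facet(rows, column):
    """Count how often each distinct value occurs in a column."""
    return Counter(row[column] for row in rows)


def filter_rows(rows, selections):
    """Keep rows matching every selection, as when combining facets.

    selections maps a column name to the set of values chosen in its facet.
    """
    return [
        row
        for row in rows
        if all(row[column] in allowed for column, allowed in selections.items())
    ]


print(text_facet(rows, "city"))
print(filter_rows(rows, {"city": {"Paris"}, "year": {2020}}))
```

Note how the facet counts expose the inconsistent spelling "paris", which is the kind of pattern faceting is meant to surface.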
1). Introduction to OpenRefine :
- Understand the principles of data preparation
- Get familiar with the OpenRefine community and the software interface
- Install OpenRefine
2). Data Mining and Discovery :
- Learn the different facet types
- Learn to combine facets to create complex filters
- Learn how to sort data in OpenRefine
3). Data Preparation and Normalization :
- Learn point-and-click data normalization (clustering, removing duplicates, splitting cells)
- Understand how the history and undo/redo work
- Start using GREL to concatenate two fields
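The clustering item above can be illustrated with a simplified key-collision approach: each value is reduced to a normalized key, and values sharing a key are grouped as likely variants of the same thing. The sketch below is a rough approximation written for illustration, not OpenRefine's own code. The function names fingerprint and cluster and the sample values are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Build a simplified normalization key for a string."""
    text = value.strip().lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, de-duplicate and sort the tokens.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values that share a key; keep groups with more than one variant."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


print(cluster(["Tom Cruise", "Cruise, Tom", "Brad Pitt"]))
```

Each resulting group is a candidate for merging into a single spelling, which is the decision the clustering dialog asks you to make.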
4). General Refine Expression Language :
- Understand and master GREL syntax
- Learn basic GREL expressions to replace, split and compare strings
5). Data Enrichment :
- Join OpenRefine projects together
- Call an API to enrich your project
- Parse a JSON answer from an API
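As an illustration of the last item, the sketch below pulls one field out of a JSON response. Inside OpenRefine this kind of extraction is typically written in GREL with value.parseJson(); the Python version here only shows the same idea. The response shape (a "results" array of objects), the sample values, and the function name extract_field are hypothetical and do not describe any particular API.

```python
import json

# Hypothetical API response; a real service will have its own structure.
sample_response = (
    '{"results": [{"name": "Paris", "country": "FR", '
    '"lat": 48.8566, "lon": 2.3522}]}'
)


def extract_field(raw, field):
    """Return a field from the first result, or None if it cannot be read."""
    try:
        data = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        # Empty cells or malformed responses yield no value.
        return None
    if not isinstance(data, dict):
        return None
    results = data.get("results") or []
    if not results:
        return None
    return results[0].get(field)


print(extract_field(sample_response, "country"))
```

Returning None for empty or malformed responses mirrors the common situation where some rows get no usable answer from the API.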