VegBranch Normalizer
Login | Datasets | Logout
 

Normalizer

How it works Examples Download it

The Normalizer is included in vegbranch in versions beta 4.1 and higher. You can download it as a stand-alone MS-Access database with the above link.

The Normalizer takes a denormalized .csv file and normalizes it. You can specify criteria for this normalization, or let it AutoDetect them. In order to understand this, you must first understand what denormalized and normalized data are.

Normalized data have only one field (column) for each data field. The same data field does not appear in multiple columns. This is "standard" database architecture, but often users have application which require denormalized data, which is often more compact and perhaps more useful in specific situations.

Denormalized data have multiple fields (columns) that describe the same data field, each listed under a different column value for a separate field. For example, if you have many columns with species names and the cover value listed below each, that is a denormalized table (see example 1 below). The normalized table would have only species name in the one column, and cover value in the next column.

The method of labelling the denormalized ID rows to the left, ID rows above, and data cells themselves are my convention. Columns are labelled above, Rows are labelled to the left, and where all three collide, a single cell represents all three labels by using a pipe symbol, | , to separate column label, row label, and data cell label, respectively.

How normalization works:

Combination of first plot and first species : row 1
    
 Species ASpecies B
Plot 145% 
Plot 233%18%
produces:
Plot 1Species A45%
Combination of first plot and second species : row 2
    
 Species ASpecies B
Plot 145% 
Plot 233%18%
produces:
Plot 1Species A45%
Plot 1Species B 
Combination of second plot and first species : row 3
    
 Species ASpecies B
Plot 145% 
Plot 233%18%
produces:
Plot 1Species A45%
Plot 1Species B 
Plot 2Species A33%
Combination of second plot and second species : row 4
    
 Species ASpecies B
Plot 145% 
Plot 233%18%
produces:
Plot 1Species A45%
Plot 1Species B 
Plot 2Species A33%
Plot 2Species B18%

Examples

Download the following examples as .csv files, plus a stems example here.

Example 1: Simple example, plots by species with cover values:

(1 ID row, 1 ID column, no labels)
 Species ASpecies BSpecies CSpecies D
Plot 145% 2% 
Plot 233%18% 44%
Plot 3 7% 41%
Plot 42% 87%8%
Key to colors

NORMALIZED output:

unknownLeftLabel1unknownTopLabel1unknownCellDataLabel
Plot 1Species A45%
Plot 1Species B 
Plot 1Species C2%
Plot 1Species D 
Plot 2Species A33%
Plot 2Species B18%
Plot 2Species C 
Plot 2Species D44%
Plot 3Species A 
Plot 3Species B7%
Plot 3Species C 
Plot 3Species D41%
Plot 4Species A2%
Plot 4Species B 
Plot 4Species C87%
Plot 4Species D8%

Example 2: same as example 1, except includes a complex label for field names:

Plot|Spec|CoverPercent

The complex label consists of three parts: 1) the label for the data that appears in the left column, 2) the label for data in the first row, above the cover data, and 3) the label for the intersection cells, here cover percent.
These are divided with a | in the .csv file. This essentially combines three cells into one.
The three cells in which the complex label could be divided could be colored like this:

Plot|Spec|CoverPercent
(1 ID row, 1 ID column, labels)
Plot|Spec|CoverPercentSpecies ASpecies BSpecies CSpecies D
Plot 145% 2% 
Plot 233%18% 44%
Plot 3 7% 41%
Plot 42% 87%8%
Key to colors

NORMALIZED output:

PlotSpecCoverPercent
Plot 1Species A45%
Plot 1Species B 
Plot 1Species C2%
Plot 1Species D 
Plot 2Species A33%
Plot 2Species B18%
Plot 2Species C 
Plot 2Species D44%
Plot 3Species A 
Plot 3Species B7%
Plot 3Species C 
Plot 3Species D41%
Plot 4Species A2%
Plot 4Species B 
Plot 4Species C87%
Plot 4Species D8%

Example 3: A bit more complex. Two data fields above and to the left of the cover data, no field labels:

(2 ID rows, 2 ID columns, no labels)
  ferntreetreeshrub
  Species ASpecies BSpecies CSpecies D
steepPlot 145% 2% 
very steepPlot 233%18% 44%
flatPlot 3 7% 41%
flatPlot 42% 87%8%
Key to colors

NORMALIZED output:

unknownLeftLabel1unknownLeftLabel2unknownTopLabel1unknownTopLabel2unknownCellDataLabel
steepPlot 1fernSpecies A45%
steepPlot 1treeSpecies B 
steepPlot 1treeSpecies C2%
steepPlot 1shrubSpecies D 
very steepPlot 2fernSpecies A33%
very steepPlot 2treeSpecies B18%
very steepPlot 2treeSpecies C 
very steepPlot 2shrubSpecies D44%
flatPlot 3fernSpecies A 
flatPlot 3treeSpecies B7%
flatPlot 3treeSpecies C 
flatPlot 3shrubSpecies D41%
flatPlot 4fernSpecies A2%
flatPlot 4treeSpecies B 
flatPlot 4treeSpecies C87%
flatPlot 4shrubSpecies D8%

Example 4: Same as example 3, but added field labels:

(2 ID rows, 2 ID columns, labels)
 SpeciesTypeferntreetreeshrub
PlotSlopePlotID|Species|CoverSpecies ASpecies BSpecies CSpecies D
steepPlot 145% 2% 
very steepPlot 233%18% 44%
flatPlot 3 7% 41%
flatPlot 42% 87%8%
Key to colors

NORMALIZED output:

PlotSlopePlotIDSpeciesTypeSpeciesCover
steepPlot 1fernSpecies A45%
steepPlot 1treeSpecies B 
steepPlot 1treeSpecies C2%
steepPlot 1shrubSpecies D 
very steepPlot 2fernSpecies A33%
very steepPlot 2treeSpecies B18%
very steepPlot 2treeSpecies C 
very steepPlot 2shrubSpecies D44%
flatPlot 3fernSpecies A 
flatPlot 3treeSpecies B7%
flatPlot 3treeSpecies C 
flatPlot 3shrubSpecies D41%
flatPlot 4fernSpecies A2%
flatPlot 4treeSpecies B 
flatPlot 4treeSpecies C87%
flatPlot 4shrubSpecies D8%

Example 5: more complicated example (differing number of left and top fields, includes labels):

(4 ID rows, 3 ID columns, labels)
  SpecSynonym   Species RR
  SpeciesCodeACE1RGE3EDGRSHe2
  SpeciesTypeferntreetreeshrub
PlotSizePlotSlopePlotID|Species|CoverSpecies ASpecies BSpecies CSpecies D
44steepPlot 145% 2% 
50very steepPlot 233%18% 44%
80flatPlot 3 7% 41%
100flatPlot 42% 87%8%
Key to colors

NORMALIZED output:

PlotSizePlotSlopePlotIDSpecSynonymSpeciesCodeSpeciesTypeSpeciesCover
44steepPlot 1 ACE1fernSpecies A45%
44steepPlot 1 RGE3treeSpecies B 
44steepPlot 1 EDGRtreeSpecies C2%
44steepPlot 1Species RRSHe2shrubSpecies D 
50very steepPlot 2 ACE1fernSpecies A33%
50very steepPlot 2 RGE3treeSpecies B18%
50very steepPlot 2 EDGRtreeSpecies C 
50very steepPlot 2Species RRSHe2shrubSpecies D44%
80flatPlot 3 ACE1fernSpecies A 
80flatPlot 3 RGE3treeSpecies B7%
80flatPlot 3 EDGRtreeSpecies C 
80flatPlot 3Species RRSHe2shrubSpecies D41%
100flatPlot 4 ACE1fernSpecies A2%
100flatPlot 4 RGE3treeSpecies B 
100flatPlot 4 EDGRtreeSpecies C87%
100flatPlot 4Species RRSHe2shrubSpecies D8%

Color Key: (back)

color and nameexplanation
empty cell or unknown column label derived from an empty cell in normalized table.
Left Labellabels the "left data" column that appears below it.
Top Labellabels the "top data" row that appears to its right.
Complex labelconsists first of the last "left label", then "|", then the last "top label", then "|", then "cell data" label.
Left Data
(ID columns)
data that applies to all "cell data" in the same row.
Top Data
(ID rows)
data that applies to all "cell data" in the same column.
Cell datalinked to the "top data" in same column and "left data" in same row.
Highlighted cellillustrates something.