EstdGrup: Field statistics by grouping records according to the values present in other fields

Access the application from the menu: "Tools | Alphanumerical databases | Statistics of record groups"

Presentation and options

To define the operation that this application performs, each field of the database is classed as a grouping field, a calculated field, or simply a non-participating field.

The grouping fields are those which classify the results according to the different values each of them takes. Once a grouping field has been defined, the program determines the number and the characteristics of the distinct values it takes in the database, which is organized relationally according to the identifiers of the graphic objects of the main table. The set of all distinct values of a field found in the database is known as the projection of the field. The statistics requested for each calculated field are computed for each element of the projection of the grouping field.
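The idea of a projection and of per-element statistics can be sketched in a few lines of Python. This is only an illustration and not how EstdGrup is implemented: the record layout (a list of dictionaries), the field names `landuse` and `area`, and the helper names `projection` and `group_by` are all assumptions made for the example.

```python
from collections import defaultdict


def projection(records, field):
    """Return the distinct values of `field`, sorted by character code."""
    return sorted({rec[field] for rec in records})


def group_by(records, grouping_field):
    """Map each element of the projection to the records that carry it."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[grouping_field]].append(rec)
    return dict(groups)


# Hypothetical table: "landuse" is the grouping field, "area" a calculated field.
records = [
    {"landuse": "forest", "area": 12.5},
    {"landuse": "crop", "area": 3.0},
    {"landuse": "forest", "area": 7.5},
]

proj = projection(records, "landuse")
groups = group_by(records, "landuse")

# One statistic (the mean of "area") per element of the projection.
mean_area = {
    value: sum(r["area"] for r in groups[value]) / len(groups[value])
    for value in proj
}
```

Here `proj` holds the two distinct values of the grouping field, and `mean_area` holds one result per element of that projection.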

The calculated fields are the fields of the database for which the user wants specific statistics computed. The user can choose which statistics are calculated for each calculated field. Depending on the treatment that each field receives (normally quantitative for numerical fields and categorical for the rest), the user can choose any or all of the following statistical calculations:

Both treatments:
- Number of total records (includes NoData ⁽¹⁾)
- Number of records with data (excluding NoData ⁽¹⁾)
- Percentage of records in one element of the projection relative to the total number of records, not counting NoData values ⁽¹⁾
- Percentage of records in one element of the projection relative to the total number of records, counting NoData values ⁽¹⁾
Categorical treatment:
- Mode; if there is more than one mode, this will be the first value ⁽²⁾
- Occurrence percentage of the mode excluding the NoData values
- Shannon index
- First value ⁽²⁾
- Last value ⁽²⁾
Quantitative treatment:
- Mean
- Standard deviation (dividing by N)
- Variance (dividing by N)
- Sum
- Range (for whole numbers, 1+max-min)
- Minimum
- Maximum
- Median
- Mean absolute deviation around the median

(1) NoData for a database includes the explicit NoData values, empty records and blank records.
(2) For the numerical treatment the order is natural (1,2,3...), but for the categorical treatment the type of order used must be taken into account: it is a non-strict alphabetical order, based on the ASCII codes of each character. This means that A comes before B, as expected, but a and à come after B.
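The statistics listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's own code: NoData is represented by `None`, each series contains at least one value with data, the Shannon index uses the natural logarithm (the base is not stated here), and the function names and dictionary keys are hypothetical. Python compares strings by character code, which reproduces the ordering described in note (2).

```python
import math
from collections import Counter
from statistics import median


def categorical_stats(values):
    """Categorical treatment; None stands in for NoData (assumption)."""
    data = sorted(v for v in values if v is not None)  # character-code order
    counts = Counter(data)
    top = max(counts.values())
    # With several modes, keep the first one in character-code order.
    mode = min(v for v, c in counts.items() if c == top)
    n = len(data)
    # Shannon index with natural logarithm (the base is an assumption).
    shannon = -sum((c / n) * math.log(c / n) for c in counts.values())
    return {
        "n_total": len(values),
        "n_data": n,
        "mode": mode,
        "mode_pct": 100.0 * top / n,  # excludes NoData
        "shannon": shannon,
        "first": data[0],
        "last": data[-1],
    }


def quantitative_stats(values):
    """Quantitative treatment; None stands in for NoData (assumption)."""
    data = [v for v in values if v is not None]
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n  # divides by N, not N-1
    med = median(data)
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "variance": variance,
        "sum": sum(data),
        "min": min(data),
        "max": max(data),
        "range_int": 1 + max(data) - min(data),  # whole-number range
        "median": med,
        "mad_median": sum(abs(x - med) for x in data) / n,
    }


cat = categorical_stats(["a", "B", "A", "B", "a", None])
num = quantitative_stats([2, 4, 4, 6, None])
```

In the categorical example, B and a both occur twice; B is reported as the mode because it comes first in character-code order.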

For quantile calculations such as the median, the modifier /MEDIANA_EMPAT= sets the type of tiebreaker used when the position of the quantile falls between two values of the series. For more information, see general syntax.
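To illustrate what a tiebreaker means for the median, the sketch below resolves the case where the median position falls between two values of the sorted series. The policy names `"lower"`, `"upper"` and `"mean"` are hypothetical labels chosen for this example; the values actually accepted by the modifier are described in the general syntax and are not reproduced here.

```python
def median_with_tiebreak(values, tie="mean"):
    """Median of `values`; `tie` picks the result for an even-length series.

    The policy names are hypothetical, not the modifier's real values.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]  # odd length: the position is an actual value
    lower, upper = data[mid - 1], data[mid]
    if tie == "lower":
        return lower
    if tie == "upper":
        return upper
    if tie == "mean":
        return (lower + upper) / 2
    raise ValueError(f"unknown tie policy: {tie!r}")
```

With the series 1, 3, 5, 7 the position of the median lies between 3 and 5, so the three policies return 3, 5 and 4.0 respectively; with an odd number of values the policy has no effect.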

The result of the statistical analysis can be presented in three formats:

HTM: A single file in HTML format is generated which shows the statistical results for each defined grouping level in tabular form.
DBF: A DBF table is generated for each grouping field, with one record for each value in the projection of the grouping field and with the statistics of each calculated field as fields of this DBF table.
CSV: A spreadsheet in text format using a list separator (usually the character ;) is generated, with the same structure as the HTM format.
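As a rough picture of a text file that uses ; as the list separator, the following sketch writes one row per element of a projection with Python's standard csv module. The column names and the row layout are assumptions made for illustration; the exact layout produced by EstdGrup follows its HTM report and is not reproduced here.

```python
import csv
import io


def write_group_stats_csv(rows, fieldnames, delimiter=";"):
    """Serialize per-group statistics as delimiter-separated text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical results: one row per value of a grouping field "landuse".
text = write_group_stats_csv(
    [
        {"landuse": "crop", "mean_area": 3.0},
        {"landuse": "forest", "mean_area": 10.0},
    ],
    ["landuse", "mean_area"],
)
```

A file written this way opens directly in spreadsheet software whose regional settings use ; as the list separator.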

Depending on the output formats, the program has the following modes of operation:

EstdGrup HTM

EstdGrup DBF

EstdGrup CSV

Dialog box of the application

EstdGrup dialog box

Syntax

Syntax:

EstdGrup Option OriginFile DestFile [/GRP_#] [/ESTD_#] [/CAT_#] [/NUM_#] [/N_REG_TOTALS_#] [/N_REG_DADES_#] [/PRCNT_MODA_#] [/MODA_#] [/I_SHAN_#] [/MITJANA_#] [/DESV_STD_#] [/VAR_#] [/SUMA_#] [/MIN_#] [/MAX_#] [/PRCNT_GRUP_#] [/PRCNT_GRUP_NODATA_#] [/MEDIANA_#] [/DESV_MEDIANA_#] [/MEDIANA_EMPAT_#=] [/N_DECIMALS]
EstdGrup Option OriginFile DestDir [/GRP_#] [/ESTD_#] [/CAT_#] [/NUM_#] [/N_REG_TOTALS_#] [/N_REG_DADES_#] [/PRCNT_MODA_#] [/MODA_#] [/I_SHAN_#] [/MITJANA_#] [/DESV_STD_#] [/VAR_#] [/SUMA_#] [/MIN_#] [/MAX_#] [/PRCNT_GRUP_#] [/PRCNT_GRUP_NODATA_#] [/MEDIANA_#] [/DESV_MEDIANA_#] [/MEDIANA_EMPAT_#=] [/N_DECIMALS]
EstdGrup Option OriginFile DestFile [/GRP_#] [/ESTD_#] [/CAT_#] [/NUM_#] [/N_REG_TOTALS_#] [/N_REG_DADES_#] [/PRCNT_MODA_#] [/MODA_#] [/I_SHAN_#] [/MITJANA_#] [/DESV_STD_#] [/VAR_#] [/SUMA_#] [/MIN_#] [/MAX_#] [/PRCNT_GRUP_#] [/PRCNT_GRUP_NODATA_#] [/MEDIANA_#] [/DESV_MEDIANA_#] [/MEDIANA_EMPAT_#=] [/N_DECIMALS]

Options:

HTM (or 1): A report of the result of the statistical calculations is generated in HTML format.
DBF (or 2): One table in DBF format is generated in the destination directory for each grouping field. Each table contains the statistical results of its corresponding grouping field.
CSV (or 3): A spreadsheet in CSV format (which defines the fields using a list separator) is generated with the result of the statistical calculations.

Parameters:

OriginFile (Origin File - Input parameter): REL file corresponding to the database of a structured vector layer, or DBF table, with which the selected statistical calculations are carried out.
DestFile (Destination File - Output parameter): For HTM (1) and CSV (3) options this is the file that will contain the results of the statistical calculations indicated.

DestDir (Destination Directory - Output parameter): For the DBF (2) option, this is the directory in which the DBF table generated for each grouping field is written.
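When the program is driven from a script, the positional part of the command line can be assembled as in the sketch below. It only builds the argument list and does not run anything. The helper name `build_estdgrup_command` and the file names are hypothetical, the executable is assumed to be reachable as `EstdGrup`, and modifiers are passed through as opaque strings because their value syntax is not described in this section.

```python
VALID_OPTIONS = {"HTM", "DBF", "CSV", "1", "2", "3"}


def build_estdgrup_command(option, origin_file, destination, modifiers=()):
    """Return the argument list: EstdGrup Option OriginFile DestFile|DestDir.

    `destination` is a file for HTM/CSV and a directory for DBF.
    `modifiers` are appended unchanged (their syntax is not validated here).
    """
    option = str(option).upper()
    if option not in VALID_OPTIONS:
        raise ValueError(f"unknown option: {option!r}")
    return ["EstdGrup", option, str(origin_file), str(destination), *modifiers]


# Hypothetical file names, for illustration only.
cmd_htm = build_estdgrup_command("htm", "parcels.dbf", "report.htm")
cmd_dbf = build_estdgrup_command(2, "parcels.dbf", "out_dir")
```

A list built this way could then be handed to `subprocess.run` from the standard library.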

Modifiers:

GRP_#

general syntax

ESTD_#

CAT_#

NUM_#

N_REG_TOTALS_#

N_REG_DADES_#

PRCNT_MODA_#

MODA_#

I_SHAN_#

MITJANA_#

DESV_STD_#

VAR_#

SUMA_#

MIN_#

MAX_#

PRCNT_GRUP_#

PRCNT_GRUP_NODATA_#

MEDIANA_#

DESV_MEDIANA_#

MEDIANA_EMPAT_#=

general syntax

N_DECIMALS