Methodology
The Business Database
ERISS uses
a commercially available database of local businesses within specific
size and industry categories. The database is then subjected to a quality
assurance review, including an examination of business records for correct
Standard Industrial Classification (SIC), name, titles, size and other
relevant data points. This database is integrated into our current survey
programming processes, and businesses are targeted based on their
SIC code, size and employment projections.
Sampling
ERISS conducts
large-scale, representative sample surveys using a technique known as
"stratified proportional quota sampling." This technique enables
researchers to accurately represent the major characteristics of a population
by sampling a proportional amount of each. Pre-determined population
characteristics (such as company size, industry, and region) serve as
stratification criteria. The result of this sampling design is an optimized
allocation of the sample among the various strata. Additionally, this
technique allows researchers a great deal of flexibility in terms of
representation of pre-chosen strata within the population. For example,
a particular region may have relatively few large companies, but these
companies account for a large proportion of regional employment. Purposive
oversampling of larger companies ensures the employees and occupations
at these companies are represented in the survey to the degree they
impact the region, not just in proportion to their share of the
overall population.
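The allocation described above can be sketched in a few lines of Python. This is a minimal illustration, not the ERISS implementation: the function name, the firm counts, the total of 2,000 surveys and the oversampling multiplier of 3.0 are all assumptions chosen for the example.

```python
def allocate_quotas(population_counts, total_sample, oversample=None):
    """Split total_sample across strata in proportion to population counts.

    oversample maps a stratum to a multiplier (> 1 boosts that stratum).
    """
    oversample = oversample or {}
    # Weight each stratum by its population count times any boost factor.
    weights = {s: n * oversample.get(s, 1.0) for s, n in population_counts.items()}
    total_weight = sum(weights.values())
    # Round to whole surveys; a stratum cannot supply more firms than it has.
    return {
        s: min(population_counts[s], round(total_sample * w / total_weight))
        for s, w in weights.items()
    }


# Hypothetical region: number of firms in each size class.
firms = {"small": 16000, "medium": 3500, "large": 500}

proportional = allocate_quotas(firms, 2000)
boosted = allocate_quotas(firms, 2000, oversample={"large": 3.0})

print(proportional)  # strictly proportional quotas
print(boosted)       # large firms purposively oversampled
```

With strict proportionality the 500 large firms receive only 50 of the 2,000 surveys. Applying the multiplier raises their quota at the expense of the smaller size classes, which is the tradeoff purposive oversampling makes.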
Finally,
ERISS combines this sampling method with a "census" survey
approach to ensure as many businesses are contacted as possible. To
this end, ERISS attempts to contact every business in the database and
achieves response rates of 20% to 40%. This means that for an area with
20,000 businesses, we would obtain responses from between 4,000 and 8,000
firms.
The higher
number of responses ensures sufficient data is gathered to provide detailed
local information, even when stratified by sub-regions, industry classifications,
size and even customized industry sub-clusters. Such a
large sample, combined with stratified proportional quota sampling, yields
data that is representative on a number of dimensions. Random samples can
also be stratified on multiple dimensions, but because fewer businesses
are surveyed in a random sample, these stratifications are based on
fewer responses and may have a higher level of error.
Methodological Tradeoffs
There are
inherent advantages and disadvantages to any research methodology. There
is no single method best suited to all types of research. Business research
is typified by a series of tradeoffs that are best characterized in
terms of cost/benefit. Research goals, intended use of the data and
available resources largely dictate the optimum method of choice. The
method that returns the most value is often not the most theoretically
desirable, but rather the most valid for the purpose and budget of the
survey.
Obtaining
a true random sample along all desired stratification levels usually
requires the expenditure of resources greatly out of proportion to the
expected returns. For the applied purposes of most workforce professionals,
surveying the greatest number of employers possible in a representative,
valid, and cost-effective manner is of greater importance than achieving
a true random sample that allows for the use of statistical techniques
that are often of little applied value to the purposes of the survey
on a local workforce development level. Therefore, for research involving
job markets, sample representativeness is often a more important consideration
than true randomization (which is better suited to research more experimental
in nature and design, and where timeliness of the data is not an issue).
Additionally,
due to selection bias, non-response, inaccurate or non-representative
database source, attrition and other factors, studies originally designed
as random sample surveys often do not qualify as random after the survey
is complete. Because such surveys depend on randomization to ensure the generalizability
of results (rather than targeting individual strata or using other
techniques to ensure representativeness), some "random
sample" surveys end up with samples that are neither random nor representative.
Assuming
a true random sample is achieved, what are the advantages and disadvantages
of true random sampling over other techniques such as proportional quota
sampling? The primary advantage is that statistical theory can be used
to make generalizations from the sample of employers to the population.
However, sampling theory dictates that the more units sampled from a
population, the closer the sample comes to approximating the characteristics
of the population (assuming sampling is equivalent across various demographic
dimensions, such as company size). When a large proportion of the business
population is surveyed, the ability to make accurate statements about
those businesses is largely equivalent to making statements about the
population as a whole. Additionally, measures to ensure and gauge external
validity can be utilized. For example, comparing ERISS data for
a key factor such as salary with pre-existing salary information about
the target population can add confidence to the survey data.
In sum,
as compared to a random sample survey, the primary advantage of our
approach is a representative and valid sample containing a higher number
of responses, representing more occupations, with a lower cost, and
most importantly, more timely results.
Tracking of Stratification Parameters
As previously
described, the sample is stratified in terms of industry, company size,
and region. This is accomplished by means of real-time data tracking
and targeting. The emerging sample is inspected daily, and resources
are adjusted and reallocated to ensure representativeness in terms of
each stratification parameter. For example, if our database indicates
that the Business Services industry accounts for roughly 8% of all businesses
in a region, steps are taken to ensure this industry represents approximately
8% of all businesses in the final survey data. Similarly, company size
(regardless of industry assignment), and regional representation are
tracked and adjusted. The survey is not considered complete until all
stratification parameters fall within the predetermined ranges, and
the overall required response rate has been achieved.
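A daily check of this kind might look like the sketch below. Only the 8% Business Services share comes from the text above; the other industries, the completed counts and the tolerance of two percentage points are assumptions made for illustration.

```python
def strata_out_of_range(target_shares, completed_counts, tolerance=0.02):
    """Return strata whose share of completed surveys misses its target.

    A positive gap means the stratum is under-represented and needs more
    surveys; a negative gap means it is over-represented.
    """
    total = sum(completed_counts.values())
    gaps = {}
    for stratum, target in target_shares.items():
        actual = completed_counts.get(stratum, 0) / total if total else 0.0
        gap = target - actual
        if abs(gap) > tolerance:
            gaps[stratum] = round(gap, 3)
    return gaps


# Target shares taken from the business database (hypothetical apart from 8%).
targets = {"Business Services": 0.08, "Retail": 0.22, "Other": 0.70}
# Surveys completed so far in each industry (hypothetical).
completed = {"Business Services": 30, "Retail": 230, "Other": 740}

gaps = strata_out_of_range(targets, completed)
print(gaps)
```

In this example Business Services sits at 3% of completed surveys against a target of 8%, so it would be flagged for additional targeting. An empty result would indicate that every stratum falls within its range.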
The Staffing Pattern
ERISS employs
sophisticated staffing patterns to determine which occupations to survey
for each individual business. These staffing patterns are based upon
interviews with more than 250,000 employers, crossing four-digit SIC
codes to enhanced, 8-digit O*NET codes (ERISS uses customized eight-digit
O*NET codes to account for new and emerging occupations not represented
in the standard O*NET coding system).
Considering
the volatility of occupational titles and the evolution and development
of new industries and categories of businesses, staffing patterns tend
to lose their ability to reflect occupational trends over time. Therefore,
the ERISS staffing patterns are modified and updated to accurately reflect
occupations in the current labor market. This is accomplished by continuous
adjustments and improvements as we incorporate information from each
completed survey to reflect new occupations or existing occupations
appearing in new industrial classifications.
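One simple way to picture a staffing pattern is as a lookup from an industry code to the set of occupation codes expected there, updated as surveys come in. The sketch below assumes that structure; the function name is hypothetical and the codes are placeholders, not real SIC or O*NET assignments.

```python
from collections import defaultdict

# Maps a four-digit SIC code to the set of eight-digit occupation codes
# expected at businesses in that industry.
staffing_pattern = defaultdict(set)


def record_survey(sic_code, reported_occupations):
    """Fold one completed survey into the pattern; return newly seen codes."""
    new_codes = set(reported_occupations) - staffing_pattern[sic_code]
    staffing_pattern[sic_code] |= new_codes
    return new_codes


# Placeholder codes only, chosen to show the shape of the data.
record_survey("9999", ["00-0000.01", "00-0000.02"])
added = record_survey("9999", ["00-0000.02", "00-0000.03"])

print(added)                           # occupations new to this industry
print(sorted(staffing_pattern["9999"]))
```

Returning the newly seen codes makes it easy to review occupations that have started appearing in an industry before they are accepted into the pattern.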
Occupational Targeting
ERISS deploys
both preset and dynamic targeting criteria for occupational selection.
Preset targeting criteria include:
· Sponsor requests - occupations that are selected as "special interest" occupations by our customers.
· Occupational "rarity" - for example, when surveying fire stations, the occupation "fire chief" would be given top priority since it is found nowhere else.
· Number of potential employers - the more potential employers, the higher the priority.
The ERISS
dynamic targeting process is designed to adjust the parameters of the
survey in real-time, as the survey is being conducted. The targeting
program works by following the pre-set parameters to ensure the proper
amounts and types of industries, occupations and company sizes are surveyed.
These parameters operate at multiple levels simultaneously and the number
of surveys required for each occupation varies according to the fulfillment
of certain preset requirements as well as the characteristics of the
developing survey data.
To determine
the proper number of surveys required for each occupation, dynamically
adjusting thresholds, referred to as "floors" and "ceilings,"
are used. The "ceiling" is defined as the point at which the
collection of more data for a particular occupation will no longer significantly
impact the results. The "floor" refers to the minimum number
of occupational surveys required for occupational data to be considered
valid and publishable. In response to live survey data, the program
adjusts these thresholds for each occupation. For example, if the existing
data for a particular occupation is displaying a high amount of salary
variability, the program adjusts the floor so that a higher number of
these occupations are targeted for surveying, thereby increasing the
confidence in the data for that occupation. Occupations with very little
salary variability require fewer data points to achieve the same level
of confidence. As more occupational surveys are collected, and the dynamic
minimum threshold is reached, the probability that the occupation will
be selected for surveying declines until, after the maximum threshold
is reached, the probability becomes zero, and the occupation is no longer
surveyed.
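The floor and ceiling behavior can be sketched as follows. The text does not say how variability is measured or how quickly the selection probability falls, so this example assumes the coefficient of variation as the variability measure, a doubling of the floor when it exceeds 0.25, and a linear decline between floor and ceiling. All names and numbers are illustrative.

```python
from statistics import mean, stdev


def adjusted_floor(base_floor, salaries, cv_threshold=0.25, boost=2.0):
    """Raise the floor when the salaries observed so far vary widely.

    Variability is measured by the coefficient of variation
    (standard deviation divided by mean); this is an assumption.
    """
    if len(salaries) < 2:
        return base_floor
    cv = stdev(salaries) / mean(salaries)
    return int(base_floor * boost) if cv > cv_threshold else base_floor


def selection_probability(collected, floor, ceiling):
    """1.0 below the floor, falling linearly to 0.0 at the ceiling."""
    if collected < floor:
        return 1.0
    if collected >= ceiling:
        return 0.0
    return (ceiling - collected) / (ceiling - floor)


# Hypothetical occupation with a base floor of 10 and a ceiling of 30.
stable = adjusted_floor(10, [30000, 30500, 31000])
volatile = adjusted_floor(10, [20000, 40000, 90000])

print(stable, volatile)
print(selection_probability(5, 10, 30))
print(selection_probability(20, 10, 30))
print(selection_probability(30, 10, 30))
```

The occupation with tightly clustered salaries keeps its base floor, while the one with widely spread salaries has its floor raised, so more surveys are required before its selection probability begins to fall.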
Additionally,
before the survey, targeting parameters can be manually adjusted to
ensure that certain "VIP" businesses or other high-priority
targets, whether they be industries, regions, occupations or even specific
firms or companies, are sampled in sufficient numbers. One special feature
of the program is the ability to know when enough occupations have been
surveyed using more than simple "counts." For instance, some
occupations typically have a great deal of salary variability (i.e., a
large range between the highest and lowest salaries). These types of
occupations require more data in order to make confident statements
concerning the average and median salaries. Using existing data from
the Bureau of Labor Statistics and other sources, these occupations
are identified prior to the start of the survey and assigned higher
completion thresholds ensuring adequate numbers are surveyed according
to the salary characteristics.
Data Validation
Data is
validated at three points during the survey process: during the live
survey (real-time validation), through spot checks of completed surveys (usually
within 24 hours of the survey) and after the survey calls are complete
(post-survey validation).
Real-time
validation is conducted along several dimensions, the first being during
keyboard entry. The ERISS CATI (Computer Aided Telephone Interview)
system performs extensive logic and range checking on
salary points, hiring and turnover numbers, etc. During the survey,
ERISS surveyors are also subject to both visual (observations of screens)
and audio ("silent" listening) monitoring. Experienced ERISS
team-leaders, trained in management and leadership, listen in on randomly
selected survey calls. This monitoring not only ensures that data is
collected in a standardized and reliable manner, but also that surveyors
are polite and professional when interviewing employers. Additionally,
ERISS survey team-leaders visually monitor the surveyors' computer screens
to track survey progress and method.
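The logic and range checks performed during keyboard entry could resemble the sketch below. The field names, the salary bounds and the specific rules are assumptions made for illustration; the actual checks in the CATI system are not described in detail here.

```python
def validate_entry(entry, salary_range=(10_000, 500_000)):
    """Return a list of problems found in one occupational survey entry."""
    problems = []
    low, high = salary_range
    # Range checks: each salary point must fall inside plausible bounds.
    for field in ("salary_low", "salary_median", "salary_high"):
        if not low <= entry[field] <= high:
            problems.append(f"{field} out of range")
    # Logic check: the salary points must be in ascending order.
    if not entry["salary_low"] <= entry["salary_median"] <= entry["salary_high"]:
        problems.append("salary points not in ascending order")
    # Logic check: separations cannot exceed headcount plus hires.
    if entry["separations"] > entry["employees"] + entry["hires"]:
        problems.append("turnover exceeds headcount plus hires")
    return problems


# Hypothetical entries as a surveyor might key them in.
good = {"salary_low": 32000, "salary_median": 41000, "salary_high": 55000,
        "employees": 12, "hires": 3, "separations": 2}
bad = {"salary_low": 60000, "salary_median": 45000, "salary_high": 700000,
       "employees": 5, "hires": 1, "separations": 9}

print(validate_entry(good))
print(validate_entry(bad))
```

Running checks like these at entry time lets the surveyor correct a mistyped figure while the employer is still on the line.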
Also during
the live-survey, survey statistics are monitored in real-time to track
trends in the developing database. For each individual surveyor as well
as at the aggregate level, these statistics document such factors as
number of calls, average length of call, successful surveys, phone appointments,
and various other factors crucial to tracking the live survey. This
information is also used to target surveyors for observation.
In addition
to the live-survey monitoring, daily tabulations of the survey data
are conducted. At the end of each day of survey operations, a review
and analysis of survey statistics and other statistical information
collected to date is conducted. These "daily" statistics are
used to analyze and interpret trends in key areas such as "representativeness"
by size and industry. This information is used to devise strategy and
targeting adjustments for the following day's survey.
Also, daily
validation calls are made to participants to spot-check data. These
calls are made by experienced ERISS surveyors to randomly selected employers.
This not only provides a valuable validity check on the collected data,
but it also provides a check on the accuracy of individual surveyors.
Post-Survey Validation
Once the
calls are complete, the raw data is processed by a series of post-processing
software programs developed specifically for the purpose. These programs
flag (and eliminate if desired) data outliers, determine and flag data
variability, create data aggregations and provide summary tables as
well as data validation tabulations. Data points that are considered
suspect or out-of-bounds according to a number of parameters are rechecked
by calling the relevant employer and asking validation questions. If
the data in question is still suspect, the decision can be made to remove
it from the data set. Once all data runs are complete, the raw data is
then processed using SPSS software (Statistical Package for the Social
Sciences) macros to calculate a variety of statistical measures that
we then compare to the aggregate results above. Tests include response
distribution against the target population along geographical areas,
industry clusters and employer size. Based upon these data runs, we
create a summary of our findings relating to data validity by industry,
by area and by employer size by area.
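The outlier-flagging step can be illustrated with a common rule of thumb. The sketch below uses the interquartile range (IQR) rule, which is an assumption for the sake of example; the text does not state which parameters the post-processing programs use. The salary figures are hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical salaries reported for one occupation.
salaries = [41000, 42500, 43000, 44000, 45500, 46000, 120000]
flagged = flag_outliers(salaries)

print(flagged)  # positions of data points to recheck with the employer
```

Flagged values are candidates for a validation call rather than automatic removal, in keeping with the recheck step described above.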
Lastly,
the resulting data sets are evaluated against initial targeting parameters
as agreed upon with the survey sponsor. In consultation with the sponsor,
a determination is made as to whether the survey is complete, or whether
to continue the survey to gather additional information.
Guidelines for Publication of Data
After the
survey is completed, not all data will meet minimum requirements for
publication. Although publication can take the form of an Internet application,
or a written report, these guidelines remain the same.
One requirement
is that there be enough observations to validly represent a particular
occupation as determined by the occupational "floors" discussed
above. As such, the required number of observations will vary from occupation
to occupation. For example, if we obtained four observations for "Police
Officer" from a total of five possible precincts, we would publish. On
the other hand, a higher publishing criterion would apply to occupations
such as "General Managers and Top Executives," where there
are many more employers and higher data variability.
In order
to ensure confidentiality, ERISS will, under no circumstances, publish
occupational data representing fewer than four employers, nor any occupation
where just one employer represents an overall weight of 80% or more.
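The two confidentiality rules can be expressed as a short check. This is a sketch under assumptions: the function name is hypothetical, and "weight" is taken here to mean each employer's contribution to the occupation's data (for example, the number of employees reported). The occupational floor is a separate requirement and is not covered by this function.

```python
def publishable(employer_weights, min_employers=4, max_share=0.80):
    """Apply the two confidentiality rules to one occupation.

    employer_weights maps an employer ID to its weight in the
    occupation's data.
    """
    # Rule 1: at least four employers must be represented.
    if len(employer_weights) < min_employers:
        return False
    total = sum(employer_weights.values())
    if total <= 0:
        return False
    # Rule 2: no single employer may carry 80% or more of the weight.
    return max(employer_weights.values()) / total < max_share


# Hypothetical weights for three occupations.
too_few = {"a": 5, "b": 5, "c": 5}
balanced = {"a": 5, "b": 5, "c": 5, "d": 5}
dominated = {"a": 80, "b": 10, "c": 5, "d": 5}

print(publishable(too_few), publishable(balanced), publishable(dominated))
```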