January 7 2009
 



Methodology

The Business Database

NCBER uses a commercially available database of local businesses within specific size categories. The database is then subjected to a quality assurance review, including an examination of business records for correct Standard Industrial Classification (SIC), name, titles, size and other relevant data points. This database is integrated into our survey programming processes and proprietary Computer Aided Telephone Interviewing (CATI) system.

Sampling

NCBER conducts large-scale, representative sample surveys using a technique known as "stratified proportional quota sampling." This technique enables researchers to accurately represent the major characteristics of a population by sampling a proportional amount of each. Pre-determined population characteristics (such as company size, industry, and region) serve as stratification criteria.

The result of this sampling design is an optimized allocation of the sample among the various strata. Additionally, this technique allows researchers a great deal of flexibility in how pre-chosen strata within the population are represented. For example, a particular region may have relatively few large companies, but these companies account for a large proportion of regional employment. Purposive oversampling of larger companies ensures that the employees and occupations at these companies are represented in the survey to the degree they impact the region, not just in proportion to their share of the overall population.
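The proportional part of this allocation can be sketched in a few lines of Python. This is a minimal illustration only, not NCBER's actual procedure: the stratum names, the population counts and the function name `proportional_quotas` are all assumptions made for the example, and the largest-remainder rounding is simply one reasonable way to make the quotas add up to the overall target.

```python
from math import floor

def proportional_quotas(population_counts, total_responses):
    """Give each stratum a response quota proportional to its population share."""
    total_pop = sum(population_counts.values())
    exact = {s: total_responses * n / total_pop for s, n in population_counts.items()}
    quotas = {s: floor(x) for s, x in exact.items()}
    # Hand any leftover responses to the strata with the largest fractional parts,
    # so the quotas always sum to the overall target.
    leftover = total_responses - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - quotas[s], reverse=True)
    for s in by_remainder[:leftover]:
        quotas[s] += 1
    return quotas

# Hypothetical size strata for a region (illustrative numbers only).
population = {"25-99 employees": 14000, "100-499 employees": 5000, "500+ employees": 1000}
quotas = proportional_quotas(population, 4000)
print(quotas)
```

Purposive oversampling of a stratum, such as the largest companies, would then amount to raising that stratum's quota above its proportional value.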

Finally, NCBER combines this sampling method with a "census" survey approach to ensure as many businesses are contacted as possible. To this end, NCBER attempts to contact every business in the business database with 25 or more employees, and achieves response rates of 20% to 40%. This means that for an area with 20,000 such businesses, we would obtain responses from between 4,000 and 8,000 businesses. This is conservatively four to ten times as many responses as would be obtained using a traditional stratified random sampling technique.

The higher number of responses ensures that sufficient data is gathered to provide detailed local information, even when stratified by subregion, industry classification, business size and even customized industry sub-clusters. Combining such a large sample with stratified proportional quota sampling yields data that is representative on a number of dimensions. Random samples can also be stratified on multiple dimensions, but because fewer businesses are surveyed in a random sample, these stratifications are based on fewer responses and may have a higher level of error.
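As a rough illustration of why cells with fewer responses carry more error, the sketch below uses the textbook margin-of-error formula for a proportion under simple random sampling. This formula is an assumption brought in for the example: it is not described in this methodology, and it does not strictly apply to quota samples. The function name and the cell sizes are hypothetical.

```python
from math import sqrt

def approx_margin_of_error(responses, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sampling)."""
    if responses <= 0:
        raise ValueError("responses must be positive")
    return z * sqrt(proportion * (1 - proportion) / responses)

# Compare a thinly populated cell with a well populated one.
for n in (50, 400):
    print(n, round(approx_margin_of_error(n), 3))
```

Under these assumptions, a cell with 50 responses has a margin of roughly 14 percentage points, while a cell with 400 responses has a margin of roughly 5 points.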

Methodological Tradeoffs

There are inherent advantages and disadvantages to any research methodology. There is no single method best suited to all types of research. Business research is typified by a series of tradeoffs that are best characterized in terms of cost/benefit. Research goals, intended use of the data and available resources largely dictate the optimum method of choice. An analysis of desired goals and outcomes is therefore critical in evaluating which is the best approach.

Local NCBER Census projects are unusual in that they have multi-dimensional goals:

  • Strategic Intelligence for Economic Development that includes valid, representative trends, plots and comparisons relating to business and economic development. This type of intelligence is used for a variety of purposes including policy formulation, business assistance and advocacy, resource allocation, benchmarking and after-the-fact performance measures.

  • Real-Time Tactical Early Warning Alerts used by economic development professionals to assist companies at risk of relocating, downsizing or shutting down and thus keep jobs locally.
  • Creating real-time connections between economic development stakeholders and the local business community. This is accomplished by providing real-time, tactical intelligence on the specific needs of local companies to banks and to insurance, benefits, telecommunications and utility firms. The purpose is to increase both service levels to and competition for local companies, in particular in the very large small-to-medium enterprise market, which is often neglected due to a lack of good information.

To serve these goals, NCBER data collection must therefore be broad, deep and extremely current. Surveying the greatest number of businesses possible in a representative and valid fashion is thus of greater importance than achieving a true random sample that allows for the use of statistical techniques that are often of little applied value to the purposes of typical economic development. Even when disregarding the tactical dimensions of an NCBER project, sample representativeness is often a more important consideration than true randomization (which is better suited to research more experimental in nature and design, and where timeliness of the data is not an issue).

Additionally, due to selection bias, non-response, inaccurate or non-representative source databases, attrition and other factors, studies originally designed as random sample surveys often do not qualify as random after the survey is complete. Because such studies depend on randomization to ensure the generalizability of survey results (rather than targeting individual strata or using other techniques to ensure a representative sample), the final result of some "random sample" surveys is a sample that is neither random nor representative.

Assuming a true random sample is achieved, what are the advantages and disadvantages of true random sampling over other techniques such as proportional quota sampling? The primary advantage is that statistical theory can be used to make generalizations from the sample of businesses to the entire population. However, sampling theory dictates that the more units sampled from a population, the closer the sample comes to approximating the characteristics of the population (assuming sampling is equivalent across various demographic dimensions, such as company size). When a large proportion of the business population is surveyed, the ability to make accurate statements about those businesses is largely equivalent to making statements about the population as a whole. Additionally, measures to ensure and gauge external validity can be utilized. For example, comparing NCBER data on key factors such as employee count and revenue against pre-existing revenue-per-employee figures for specific SIC codes can add confidence to the survey data.
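The revenue-per-employee comparison mentioned above could look something like the following sketch. Everything specific here is hypothetical: the function name `benchmark_deviation`, the record layout, and all of the numbers (including the benchmark value) are invented for illustration and are not NCBER or published figures.

```python
def benchmark_deviation(responses, benchmarks):
    """Relative gap between surveyed and benchmark revenue per employee, by SIC code.

    responses:  iterable of (sic_code, employees, revenue) tuples from the survey
    benchmarks: dict mapping sic_code -> external revenue-per-employee figure
    """
    totals = {}
    for sic, employees, revenue in responses:
        emp, rev = totals.get(sic, (0, 0.0))
        totals[sic] = (emp + employees, rev + revenue)
    report = {}
    for sic, (emp, rev) in totals.items():
        if sic in benchmarks and emp > 0:
            surveyed = rev / emp
            report[sic] = (surveyed - benchmarks[sic]) / benchmarks[sic]
    return report

# Invented example data: two responses in one SIC code, one made-up benchmark.
survey = [("7389", 50, 5_000_000), ("7389", 150, 17_000_000)]
report = benchmark_deviation(survey, {"7389": 100_000})
print(report)
```

A small relative gap suggests the survey responses are in line with the external figure, while a large one would be a reason to look more closely at that industry's responses.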

In sum, as compared to a random sample survey, the primary advantage of our approach is a representative and valid sample containing a much higher number of responses, which in turn provides very significant tactical advantages not possible with a random sample survey.

Tracking of Stratification Parameters

As previously described, the sample is stratified in terms of industry, company size, and region. This is accomplished by means of real-time data tracking. The emerging sample is inspected daily, and resources are adjusted and reallocated to ensure representativeness in terms of each stratification parameter. For example, if a business database indicates that the Business Services industry accounts for roughly 8% of all businesses in a region, steps are taken to ensure this industry represents approximately 8% of all responses in the final survey data. Similarly, company size and regional representation are tracked and adjusted. The survey is not considered complete until all stratification parameters fall within the predetermined ranges, and the overall required response rate has been achieved.
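The daily check described above can be sketched as a comparison between each stratum's target share and its share of the responses collected so far. This is an illustrative sketch under stated assumptions: the function name `stratum_gaps`, the tolerance of one percentage point, and the example counts are hypothetical, and the actual NCBER tracking system is not documented here.

```python
def stratum_gaps(target_shares, response_counts, tolerance=0.01):
    """Return strata whose share of responses is outside target +/- tolerance.

    Negative values mean the stratum is under-represented so far.
    """
    total = sum(response_counts.values())
    gaps = {}
    for stratum, target in target_shares.items():
        actual = response_counts.get(stratum, 0) / total if total else 0.0
        diff = actual - target
        if abs(diff) > tolerance:
            gaps[stratum] = round(diff, 4)
    return gaps

# Hypothetical industry targets (from the business database) and responses to date.
targets = {"Business Services": 0.08, "Manufacturing": 0.12, "Other": 0.80}
so_far = {"Business Services": 50, "Manufacturing": 120, "Other": 830}
gaps = stratum_gaps(targets, so_far)
print(gaps)
```

In this example Business Services sits at 5% of responses against an 8% target, so it would be prioritized in the following day's calls.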

The NCBER dynamic targeting process is designed to adjust the parameters of the survey in real-time, as the survey is being conducted. The targeting program works by following the pre-set parameters to ensure the proper amounts and types of industries, company sizes and geographical areas are surveyed. These parameters operate at multiple levels simultaneously and the number of surveys required for each stratification layer or cell varies according to the fulfillment of certain preset requirements as well as the characteristics of the developing survey data.

Additionally, before the survey, targeting parameters can be manually adjusted to ensure that certain "VIP" businesses or other high-priority targets, whether they be industries, regions, areas or even specific firms or companies, are sampled in sufficient numbers.

Data Validation

Data is validated at three points during the survey process: during the live survey (real-time validation), through spot checks of completed surveys (usually within 24 hours of the survey) and after the survey calls are complete (post-survey validation).

Real-time validation is conducted along several dimensions, the first being during keyboard entry. The NCBER CATI (Computer Aided Telephone Interviewing) system performs extensive logic and range checking. During the survey, NCBER surveyors are also subject to both visual (observations of screens) and audio ("silent" listening) monitoring. Experienced NCBER team leaders, trained in management and leadership, listen in on randomly selected survey calls. This monitoring not only ensures that data is collected in a standardized and reliable manner, but also that surveyors are polite and professional when interviewing businesses. Additionally, NCBER survey team leaders visually monitor the surveyors' computer screens to track survey progress and method.
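Logic and range checking at the point of entry might look like the sketch below. The field names (`employees`, `full_time_employees`), the numeric limits and the function name `validate_entry` are assumptions chosen for illustration; the checks actually built into the NCBER CATI system are not specified in this document.

```python
def validate_entry(record):
    """Return a list of problems found in one keyed-in survey record."""
    errors = []
    employees = record.get("employees")
    full_time = record.get("full_time_employees")

    # Range check: the value must be a whole number within plausible bounds.
    if not isinstance(employees, int) or not (1 <= employees <= 500_000):
        errors.append("employees missing or out of range")

    # Logic check: a part cannot exceed the whole.
    if isinstance(employees, int) and isinstance(full_time, int):
        if full_time < 0 or full_time > employees:
            errors.append("full-time count inconsistent with total employees")
    return errors

ok = validate_entry({"employees": 120, "full_time_employees": 95})
bad = validate_entry({"employees": 40, "full_time_employees": 75})
print(ok, bad)
```

Running such checks while the respondent is still on the phone lets the surveyor correct a mistyped or misheard answer immediately.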

Also during the live survey, survey statistics are monitored in real time to track trends in the developing database. For each individual surveyor as well as at the aggregate level, these statistics document such factors as number of calls, average length of call, successful surveys, phone appointments, and various other factors crucial to tracking the live survey. This information is also used to target surveyors for observation.

In addition to the live-survey monitoring, daily tabulations of the survey data are conducted. At the end of each day of survey operations, the survey statistics and other statistical information collected to date are reviewed and analyzed. These "daily" statistics are used to analyze and interpret trends in key areas such as "representativeness" by size and industry. This information is used to devise strategy and targeting adjustments for the following day's survey.

Also, daily validation calls are made to participants to spot-check data. These calls are made by experienced NCBER surveyors to randomly selected businesses. This not only provides a valuable validity check on the collected data, but it also provides a check on the accuracy of individual surveyors.

Post-Survey Validation

Once the calls are complete, the raw data is processed by a series of post-processing software programs developed specifically for the purpose. These programs flag (and eliminate if desired) data outliers, determine and flag data variability, create data aggregations and provide summary tables as well as data validation tabulations. Data points that are considered suspect or out-of-bounds according to a number of parameters are rechecked by calling the relevant business and asking validation questions. If the data in question is still suspect, the decision can be made to remove it from the data set.

Once all data runs are complete, the raw data is then processed using SPSS (Statistical Package for the Social Sciences) macros to calculate a variety of statistical measures that we then compare to the aggregate results above. Tests include response distribution against the target population along geographical areas, industry clusters and business size. Based upon these data runs, we create a summary of our findings relating to data validity by industry, by area and by business size by area.
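One common way to flag outliers for follow-up calls is the interquartile range (IQR) rule, sketched below. This is an assumption made for illustration: the document does not say which outlier test the NCBER post-processing programs use, and the function name `flag_outliers`, the multiplier of 1.5 and the sample values are hypothetical.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return the indices of values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]

# Invented revenue-per-employee figures (in thousands) for one industry cell.
cell = [100, 110, 120, 130, 140, 150, 160, 170, 180, 5000]
suspect = flag_outliers(cell)
print(suspect)
```

Flagged records are candidates for a validation call rather than automatic deletion, which matches the recheck-then-decide sequence described above.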

Lastly, the resulting data sets are evaluated against initial targeting parameters as agreed upon with the survey sponsor. In consultation with the sponsor, a determination is made as to whether the survey is complete, or whether to continue the survey to gather additional information.

Guidelines for Publication of Data

After the survey is completed, not all data will meet minimum requirements for publication. Although publication can take the form of an Internet application, or a written report, these guidelines remain the same.

 
  Copyright 2009 © NCBER, All Rights Reserved.