OVERVIEW OF THE HISTORY OF
THE ILR LANGUAGE PROFICIENCY SKILL LEVEL
DESCRIPTIONS AND SCALE
Dr. Martha Herzog
HOW DID THE LANGUAGE PROFICIENCY SCALE GET STARTED?
The United States has traditionally had special problems defining foreign
language competence because of the historic inattention to languages
in our general educational programs. Faced with academic gaps, the Government
has had to fill them for Government purposes. Fortunately, some of the
lessons learned by the Government have been used by others.
The foreign language competence
of U.S. Government employees was not examined during the first 175
years of our history. However, in the 1950s,
as a war with Japan was followed by a war in Korea, the United States’ lack
of preparation in foreign languages was recognized as a serious problem.
In 1952 the Civil Service Commission was directed to inventory the language
ability of Government employees and develop a register of these employees’ language
skills, background, and experience.
Unfortunately, the Commission
had no system for conducting an inventory, no proficiency test, and
no criteria for test construction. Available,
instead, were employees’ grades in language courses and self-reports
on job applications. Self-reports were likely to state something like “fluent
in French” or “excellent German,” and there has never
been standardized grading across academic institutions in this country.
The Commission concluded that the United States Government needed a system
that was objective, applicable to all languages and all Civil Service
positions, and unrelated to any particular language curriculum. Because
the academic community did not have such a system, the Government had
to develop its own.
Initially the concept met
resistance. Some Government agencies feared loss of autonomy, and everyone
understood that test results could embarrass
many employees who claimed to be “fluent” or “excellent.”
Nevertheless, the Foreign
Service Institute (FSI) began to work on solving the problem under
the leadership of its Dean, Dr. Henry Lee
Smith. He headed an interagency committee that devised a single scale
ranging from 1 to 6; that is, the first scale did not distinguish among
the four skills but simply rated “language.” Although other
Government agencies lost interest for a time, FSI continued to refine
the scale.
In 1955 a survey of all Foreign
Service Officers based on the new scale showed that fewer than half
reported a level of language “useful
to the service.” The extent of the problem was further highlighted
in 1956, when only 25% of entering Foreign Service Officers were tested
at a “useful” level of proficiency in any foreign language.
In November of 1956, the Secretary of State announced a new language
policy, including the requirement that language ability “will be
verified by tests.” In 1958, language proficiency tests became “mandatory” for
all Foreign Service Officers.
FSI’s first efforts to test according to the scale were not reliable.
The faculty found it difficult to apply the scale consistently, so results
varied from tester to tester. Tests were considered subjective and thought
to be much easier in some languages than others. However, many valuable
lessons were learned from initial tests. FSI built upon this experience
to revise the scale. One extremely important decision involved changing
the single scale for “language” to separate scales for each
skill. The scale was eventually standardized to six base levels, ranging
from 0 (= no functional ability) to 5 (= equivalent to an educated native
speaker).
Equally important was the
creation in 1958 of an independent testing office at FSI headed by
Frank Rice and Claudia Wilds, who had studied
with John B. Carroll. Professor Carroll, then at Harvard, served as a
consultant as the test was designed. The FSI Testing Unit developed a
structured interview in direct support of the six-point scale. Standardized
factors were developed for scoring, and the interview format ensured
that all factors were tested. The interaction of test format and rating
factors was crucial to the success of the test. Emphasis on a well-structured
interview reduced the problems associated with the earlier tests. The
development of standardized rating factors reduced subjectivity. The
factors provided a basis for testers’ agreement on important aspects
of test performance and helped to focus their attention during testing
and rating. This innovation created the framework for checking interrater
reliability, and a high degree of consistency in scoring resulted.
The interview soon became
the standard method of testing at FSI. For many years it was known
world-wide as the FSI interview, or just “the
FSI.” The interview and the scale gained wide recognition, and
many other Government agencies adopted the system, including the Peace
Corps for the testing of all its overseas volunteers. In 1968 several
agencies cooperatively wrote formal descriptions of the base levels in
four skills—speaking, listening, reading, and writing. The resulting
scale became part of the United States Government Personnel Manual. The
original challenge to inventory Government employees’ language
ability could finally be met.
New developments continued.
In 1976 NATO adopted a language proficiency scale related to the 1968
document. By 1985 the U.S. document had been
revised under the auspices of the Interagency Language Roundtable (ILR)
to include full descriptions of the “plus” levels that had
gradually been incorporated into the scoring system. (Since then, the
official Government Language Skill Level Descriptions have been known
as the “ILR Scale” or the “ILR Definitions.”)
Although specific testing tasks and procedures now differ somewhat from
one agency to another for operational reasons, all U.S. Government agencies
adhere to the ILR Definitions as the standard measuring stick of language
proficiency.
Also in the 1980s, the American Council on the Teaching of Foreign Languages
(ACTFL) developed and published for academic use Proficiency Guidelines
based on the ILR definitions. Like the ILR scale, the ACTFL guidelines
have undergone refinement. ACTFL also developed an Oral Proficiency
Interview (OPI) similar to the
Government test and began training educators to test according to its
scale. ACTFL and the Government have worked together closely for almost
twenty years to ensure that the two proficiency testing systems are complementary.