EVALUATIONS AND BENCHMARKS FOR ASSESSING AI RESEARCH AND DEVELOPMENT CAPABILITIES OF AI MODELS
General Information
- Contract Opportunity Type: Sources Sought (Original)
- Original Published Date: Mar 05, 2025 05:08 pm EST
- Original Response Date: Mar 15, 2025 03:30 pm EDT
- Inactive Policy: 15 days after response date
- Original Inactive Date: Mar 30, 2025
- Initiative: None
Description
This is a SOURCES SOUGHT NOTICE for market research purposes. THIS IS NOT A REQUEST FOR PROPOSALS OR A REQUEST FOR QUOTATIONS.
Artificial Intelligence (AI) is one of the defining technologies of our era. Its emergence, together with its multiplying contexts of use and increasing capabilities, presents enormous opportunities.
To enable the U.S. economy to harness the full benefits of AI, the National Institute of Standards and Technology (NIST) focuses on fundamental research and improving AI measurement science, technology, standards and related tools.[1]
The U.S. AI Safety Institute (AISI), housed within the National Institute of Standards and Technology, is developing testing, evaluations, and guidelines to help accelerate trustworthy AI innovation in the United States and around the world, with a focus on promoting measurement science for AI capabilities and helping to prevent flaws or misuses of AI technology that could undermine public safety or national security.[2]
As part of this work, AISI is conducting testing, evaluation, validation, and verification (TEVV) on high-impact frontier models’ capabilities. In doing so, AISI seeks to ensure that its projects, evaluations, and tools reflect the best available science, and to coordinate closely with a diverse set of AI stakeholders who are developing and conducting evaluations to assess capabilities, functionality, and risks.
NIST is performing market research to identify potential sources for an anticipated contract to assist in developing evaluations and benchmarks of AI models' relevant software engineering and AI research and development capabilities, functionality, and risks.
POTENTIAL CONTRACT REQUIREMENTS
The following requirements are provided to describe the Government's minimum needs. As the acquisition planning process proceeds, requirements may be modified.
The Contractor must provide or develop resources for various aspects of assessing the capability of frontier AI models to assist in software engineering and AI research and development, including by assessing the quality or functionality of AI-generated outputs in these domains and any corresponding risks. The Contractor would be responsible for conducting one or more of the tasks listed under Contractor Tasks (list A) in order to assess one or more of the capabilities listed under Relevant Model Capabilities (list B).
Contractor Tasks:
Contractors must provide or develop resources for one or more of the following:
- Developing benchmarks and scoring mechanisms for automated evaluation of AI models' relevant capabilities;
- Developing tasks for automated evaluation of AI models’ relevant capabilities with accompanying data on human baseline performance (e.g., how long the tasks take human experts to complete);
- Designing and implementing protocols or methods for evaluating AI models’ relevant capabilities;
- Development of technical resources or infrastructure for generating tasks and environments and scoring mechanisms for evaluating AI models’ relevant capabilities.
Relevant Model Capabilities:
Relevant frontier model capabilities to elicit, evaluate, and benchmark include:
- Capabilities that enable a model to assist with or automate software development activities such as designing and implementing projects based on specifications, identifying and correcting bugs, updating and refactoring code, or deploying code;
- Capabilities that enable a model to assist with or automate research activities associated with frontier AI model development, such as the ability to generate and test hypotheses relating to the design of AI models or to perform iterative experimentation;
- Capabilities that enable a model to assist with or automate engineering and infrastructure management activities associated with frontier AI model development, such as managing large-scale model training processes;
- Capabilities that enable a model to subvert or interfere with software or AI development or usage processes, such as by using sabotage, deception, or self-exfiltration;
- Other capabilities that are relevant to significantly assisting with or automating processes of software development or AI research and development, or that could create risks related to such processes.
INFORMATION FOR SUBMISSION
NIST is seeking responses from responsible sources. Responses are being sought from all business sizes.
After the results of this market research are obtained and analyzed, NIST may conduct a competitive procurement and subsequently award a contract.
The NAICS code for this effort is 541990 Professional, Scientific and Technical Services. The small business size standard is $19.0M. If a small business set-aside results, a Limitation on Subcontracting clause will be applicable to the anticipated procurement action. A business meeting the classification of the set-aside may not pay more than 50% of the amount paid to it by the Government to firms that are not similarly situated.
Responses to this sources sought shall not exceed 15 pages in length and must demonstrate the respondent's capabilities related to the potential contract requirements; responses must be detailed beyond a standard capability statement.
The following information must be provided in response to this sources sought notice:
1. Name of the company(ies) that provide the services for which information is provided, their addresses, the Unique Entity Identifier (UEI) for each company's active System for Award Management (SAM.gov) registration, and a point of contact for each company (name, phone number, fax number, and email address).
Interested parties that do not have an active SAM.gov registration are strongly encouraged to begin the registration process immediately. This process can take several weeks to complete. Parties responding to Government solicitations must have an active registration at SAM.gov in order for a quotation or proposal to be considered for award.
2. Respondents must possess knowledge and capacity in some or all of the tasks (i)-(iv) included in this sources sought, specifically as they pertain to assessing the relevant model capabilities listed above. For each applicable task area A(i)-(iv), Respondents must provide a discussion of their knowledge and capacity.
Knowledge and capacity in using AI systems for related applications or in assessing AI systems for procurement purposes are not considered applicable.
3. In addition to having knowledge and capacity in some or all of the areas above, Respondents must have:
(1) Previous experience executing on one or more of the actions and activities described above, specifically as they pertain to assessing the capabilities and/or risks of frontier AI models.
(2) Technical staff with 3 – 5 years of experience working in artificial intelligence.
(3) Potential labor categories that may be used to accomplish the work. If the Contractor has performed similar work, please provide price information for the work performed. All price information will be utilized for budgeting purposes. All price information will be held by NIST as confidential.
(4) Any other relevant information which the Government should consider in developing its minimum specifications and finalizing its market research.
The above information and any other information considered pertinent to this notification must be submitted to Carol Wood, Contracting Officer, not later than March 15, 2025, at 3:30 PM Eastern Time. Email submission is required.
[1] https://www.nist.gov/artificial-intelligence
[2] https://www.nist.gov/aisi
Contact Information
Primary Point of Contact
- Carol A. Wood
- carol.wood@nist.gov
- Phone Number 3019758172
- Fax Number 3019756273