Risk Utility Assessment

The Risk Utility Assessment tab is a key part of a data asset. Information you provide here will help USAID determine the potential risk of sharing this data with various audiences, including the public.
Value of the Data
The first question asks you to describe the value of the data.  It reads:
Please briefly describe the value of this data in conducting analysis in support of international development programming. If applicable, please note the extent to which the data asset can substantially inform cross-sectional analysis (e.g., wide range of variables, detailed longitudinal series, rare and/or unique cases or variables)?
As you are answering this question, consider how your data can be used to supply development practitioners with actionable knowledge. Consider the following questions:
  • What problem is your data attempting to solve? Specify the setting to give future users more information about the context of the study.
  • What gap(s) does this data fill?  Did you survey unique sources or underrepresented populations?
  • How did you use the data? What analysis did you perform? What is the quality of the data you are submitting?
  • Give temporal context: is this data from a study that is conducted repeatedly over a period of time? If so, is the study conducted on the same subjects each time (a longitudinal study)? Provide details about the stage of data collection: baseline, interim, or the rounds of data collection completed.
See the list below for an overview of considerations in describing the value of development data.
1.     Research question: What problem are we solving? What challenges did we face? Was the research conducted in a conflict-affected area?
2.     Data description: What type of data was collected? What is the quality of the data?
3.     Data source: People? (Human-subject research). Objects? (Non-human subject research).
4.     Data Collection: Survey? Observation?
5.     Consumer: Who will benefit from this research? Who will likely be interested in this dataset for development or scientific research purposes?
6.     Value of Information: Does this move the development community closer to achieving a desired outcome? Could the results of this study help prevent an undesirable outcome?
De-identifying data
Please describe any efforts you have already taken to de-identify potentially sensitive data within this data asset. If this is captured in a document, please upload the document in the "Data Detail" tab under "Other Reference Materials."
De-identification is the process of removing personal information from collected data before the data are submitted to the DDL. Removing unique individual identifiers is essential for minimizing the privacy risk associated with publishing survey or research data.
In this section, give potential researchers enough information about any steps that were taken to protect sensitive personal information.
Describe the method you chose to de-identify the dataset. The following are some common de-identification methods:
  • Removing direct identifiers;
  • Replacing direct identifiers with random values (e.g. 9999, ABCDE) to preserve the form of the original data, while making re-association with individuals more difficult; and
  • Replacing direct identifiers with category names or data that are obviously generic, such as “NAME,” or “012 Any Lane, City, USA,” etc.
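The three methods above can be sketched in a few lines of Python. This is only an illustration, not a DDL-prescribed procedure: the column names in `DIRECT_IDENTIFIERS` and the sample record are hypothetical, and the function names are invented for this example.

```python
import secrets
import string

# Hypothetical column names; adjust to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def remove_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Method 1: drop direct-identifier fields entirely."""
    return {k: v for k, v in record.items() if k not in identifiers}


def randomize_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Method 2: replace each identifier with random characters of the
    same form (digits stay digits, letters stay letters, other
    characters such as spaces and hyphens are kept)."""
    def scramble(value):
        out = []
        for ch in str(value):
            if ch.isdigit():
                out.append(secrets.choice(string.digits))
            elif ch.isalpha():
                out.append(secrets.choice(string.ascii_uppercase))
            else:
                out.append(ch)
        return "".join(out)

    return {k: scramble(v) if k in identifiers else v
            for k, v in record.items()}


def generalize_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Method 3: replace each identifier with an obviously generic label
    (here, the upper-cased field name, e.g. "NAME")."""
    return {k: k.upper() if k in identifiers else v
            for k, v in record.items()}


record = {"name": "Jane Doe", "phone": "555-0100",
          "district": "North", "score": 7}
print(remove_identifiers(record))      # identifier fields are gone
print(generalize_identifiers(record))  # identifiers become "NAME", "PHONE"
print(randomize_identifiers(record))   # identifiers become random text
```

Whichever method you choose, record it in your answer so that future users know how the identifiers were handled.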
Risk Assessment
Does this data asset contain information about individuals?
The answer is “yes” if your research involves human beings.
Specify whether the study is about US citizens or non-US citizens.  Please check all that apply.
Be sure to specify if the study involves US citizens in some fashion.
Direct Identifiers
Does the data asset include data or information that relates specifically to an individual, or that can be used to uniquely identify an individual?
Direct identifiers are data that can be used to identify a person without additional information. Examples of direct identifiers include names, social security numbers, and email addresses. If not properly handled, they may seriously compromise the privacy, security, and confidentiality of individuals whose records appear in the dataset. USAID requires the removal of all direct identifiers before data are submitted to the DDL.
The HIPAA Privacy Rule includes the following 18 specific data elements in its definition of direct identifiers:
1.     Names
2.     Email addresses
3.     All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes
4.     Fax numbers
5.     Device identifiers and serial numbers
6.     Telephone numbers
7.     Vehicle identifiers and serial numbers, including license plate numbers
8.     Web Universal Resource Locators (URLs)
9.     Internet Protocol (IP) addresses
10.  Social Security numbers (or other countries’ national IDs)
11.  Medical Records numbers
12.  Health Plan Beneficiary numbers
13.  Biometric identifiers, including fingerprints and voiceprints
14.  Full-face photographs and any comparable images
15.  Certificate/license numbers
16.  Any other unique identifying number or characteristic
17.  Account numbers
18.  All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89
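Item 18 differs from the others in that the value is usually coarsened, not deleted. One possible way to do this, sketched in Python, is to keep only the year of each date and to report every age over 89 as a single top category. The function names and the "90+" label are assumptions made for this example.

```python
from datetime import date


def generalize_date(d):
    """Keep only the year of a date that relates to an individual."""
    return d.year


def top_code_age(age, cap=90):
    """Report ages over 89 as one combined category, e.g. "90+"."""
    return f"{cap}+" if age >= cap else str(age)


print(generalize_date(date(1984, 3, 12)))
print(top_code_age(89))
print(top_code_age(93))
```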
Indirect Identifiers
Does the data asset include data or information that is linked to individuals or that could be used in combination with other information to identify individuals? (Yes/No)
What identifier or combination of identifiers are contained in the data asset that can be linked indirectly to an individual? (select all that apply)
Indirect identifiers are characteristics that, standing alone, do not identify a specific individual, but can, if used in combination with other information, identify an individual. Indirect identifiers can convey information that is very important to research studies in terms of demographics and other background information on study subjects and may prove valuable for later analysis. Removing them may reduce the usefulness of the dataset. Their sensitivity is assessed by determining the ability to use them in any combination to identify a unique individual.
Some of the most common indirect identifiers are background/demographic characteristics of people such as:
  • Age
  • Sex/Gender
  • Marital status
  • Race or ethnicity
  • Employment or educational attainment
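To illustrate how a combination of indirect identifiers can single out a person, the Python sketch below counts how many records share each combination of values and flags the records that fall in very small groups. The field names, the sample records, and the threshold `k` are hypothetical; this is a rough screening aid under those assumptions, not a complete disclosure-risk assessment.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return the records whose combination of indirect-identifier
    values is shared by fewer than k records in the dataset."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]


records = [
    {"age": 34, "sex": "F", "marital_status": "married"},
    {"age": 34, "sex": "F", "marital_status": "married"},
    {"age": 71, "sex": "M", "marital_status": "widowed"},
    {"age": 45, "sex": "M", "marital_status": "single"},
]

# Sex alone does not single anyone out in this sample...
print(risky_records(records, ["sex"]))
# ...but age, sex, and marital status together make two records unique.
print(risky_records(records, ["age", "sex", "marital_status"]))
```

Records flagged this way are candidates for coarsening (for example, reporting age in bands) or for a more restrictive access level.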
Proposed Access Level and Rationale
There are three access levels you can propose while submitting data:
  • Public: Data asset is or could be made publicly available to all without restrictions.
  • Restricted Public: Data asset is available under certain use restrictions. One example, among many, is data that contain sufficient granularity or linkages that make it possible to re-identify individuals, even though the data asset is stripped of direct identifiers. Another example would be a data asset that contains sensitive information about non-human subjects that is only made available to select users under strong legal protections.
  • Non-Public: Data asset is not available to members of the public. This category includes data that are only available for internal use by the Federal Government, such as by a single program, single agency, or across multiple agencies.
The rationale for restricting access to this data asset must be based on one of the following legal justifications:
  1. National Security Risk: Select this option if the information should be restricted to protect national security.
  2. Personal safety risk: Select this option if information contained in this data asset could reasonably be expected to endanger the life or physical safety of any individual.
  3. Risk to ongoing operations: Select this option if disclosure would interfere with USAID’s ability to effectively discharge its ongoing responsibilities in foreign assistance activities.
  4. Trade secrets/proprietary: Select this option if the information should remain confidential or privileged, because it reveals sensitive commercial or financial information; or if there are legal constraints on the disclosure of the business or proprietary information of non-governmental organizations, contractors, or private sector clients.
  5. Restricted via local law/bilateral agreement: Select this option when the laws or regulations of a host country, or the terms of a bilateral agreement, restrict access to the information contained in this data asset.
  6. Personal Privacy risk: Select this option when the information, if disclosed, would invade an individual’s personal privacy; or if the information would reveal the identity of an individual that must be kept confidential consistent with ethical guidelines and federal regulations; or would cause an individual to suffer harm, such as embarrassment, discrimination, etc.
  7. Internal Personnel Rules: Select this option if the information relates to the internal personnel rules and practices of USAID or another federal agency.
  8. Exempted by statute (not 552b): Select this option if the information is prohibited from disclosure by the Freedom of Information Act (FOIA), the Privacy Act, or another federal law.  
  9. Inter/Intra-agency memos: Select this option if the information contains privileged communications within or between agencies. These can include communications regarding the deliberative process, or communications that are covered by the attorney work-product privilege or the attorney-client privilege.
  10. Law Enforcement Related: Select this option if the information was compiled for law enforcement purposes, and if released: could interfere with those purposes; would deprive an individual of due process; could disclose the identity of an informant; or could disclose law enforcement techniques.
  11. Financial Institution Regulation: Select this option if the information concerns the regulation or supervision of a financial institution.
  12. Geophysical on wells: Select this option if this information, if published, could reveal the geolocation of a well.
Proposed Access Level Comment
Here you can provide any additional information to support the designated access level. This section gives you an opportunity to explain the risks associated with making this data asset publicly available. Consider highlighting contextual information, such as: the research was conducted in a conflict-affected area, or an area with state-sanctioned human rights violations; or the information may reveal the identities of, or cause discrimination against, vulnerable populations. Be sure to indicate if disclosure or publication would violate third-party agreements.