NSF REU Site: Trustworthy AI
The rapid development of Artificial Intelligence (AI) continues to produce significant economic and social benefits.
With the widespread use of AI in areas such as transportation, finance, medicine, security, and entertainment, there is a rising societal awareness that these systems must be made trustworthy. An improperly developed AI system can lead to many undesirable outcomes, ranging from biased treatment in hiring and loan decisions, to the wrongful release of prisoners, to the loss of human life. If such negative outcomes persist, human users will eventually lose trust in AI systems.
To meet this challenge, AI trustworthiness has become a focus of the federal government as well as institutional and academic initiatives such as the AI Now Institute and the ACM Conference on Fairness, Accountability, and Transparency. In response, the goal of this project at Rochester Institute of Technology (RIT) is to encourage talented undergraduate students to pursue graduate study and research careers by engaging them in exciting and meaningful research experiences in developing trustworthy AI and by cultivating their talents during their summer experiences and beyond.
This NSF REU at RIT will pursue three objectives to address the project goal:
- Engage a total of 10 students annually from traditionally underrepresented groups or from colleges and universities with limited research opportunities, and immerse them in ongoing research projects in the development of trustworthy AI for social good,
- Cultivate talented students to effectively plan, conduct, and communicate scientific research through meaningful and engaging research projects, close and effective mentoring, weekly group meetings, mentor training, and public presentations, and
- Improve educational pathways to advanced trustworthy AI research and development careers through student involvement in expert speaker series and additional professional development activities.
Schedule
| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Week 0 | Mentor Training Workshop | | | | |
| Week 1 | 1. Orientation<br>2. Cookout | Crash Course (Day 1) | Crash Course (Day 2) | Crash Course (Day 3) | Research Methods Seminar<br>Group Meeting #1 (12pm-1:30pm): Project Overview |
| Week 2 | Crash Course (Day 4) | Crash Course (Day 5) | Managing Faculty Mentor | Ph.D. Pathways Panel | Group Meeting #2: Research Proposal |
| Week 3 | | | Grad School Application | Crafting Personal Statement | Group Meeting #3: Research Proposal |
| Week 4 | | | | Invited Talk #1 | Group Meeting #4: Research Progress |
| Week 5 | | | Mid-program Evaluation | Invited Talk #2 | Group Meeting #5: Research Progress |
| Week 6 | | | | | Group Meeting #6: Research Progress |
| Week 7 | | | | | Group Meeting #7: Research Progress |
| Week 8 | | | | | Group Meeting #8: Research Progress |
| Week 9 | | | Polishing Presentation | Public Speaking Tips | Group Meeting #9: Research Presentation |
| Week 10 | Presentation Practice | Presentation Practice | Presentation Practice | Final Program Evaluation | 1. Poster & Presentation Sessions at AURS<br>2. Farewell Party |
Research Themes and Projects
User authentication verifies that someone who is attempting to access services and applications is who they claim to be. There are three broad categories of approaches to authenticating a user. The first category is based on What you know, such as a username and password or answers to security questions. This is also known as Knowledge-based Authentication (KBA). The second is based on What you have, i.e., a physical object that one can possess, e.g., a security badge or a security key such as a Yubikey. The third category is based on What you are, which can be either physiological, e.g., face, fingerprint, or iris; or behavioral, e.g., keystroke dynamics, mouse dynamics, mobile motion, or swipes. While KBA is currently the dominant approach to identity management, there are active efforts to create alternatives such as face recognition and FIDO. Behavioral biometrics is emerging as a new field of user authentication, which authenticates a user based on their behavior when interacting with a computing device (What you are).
Within this project theme, students will have opportunities to engage in activities that encompass the entire data life cycle in behavioral biometrics. These activities include designing data collection, authoring IRB protocols, managing data storage and security, preprocessing data, and performing visualization, analysis, and inference. Additionally, students will learn about performance measurement techniques such as ROC curves, EER, and FAR/FRR, and will be involved in reporting their findings. Students will have the option to build a project around one or more modalities, such as keystroke data [68, 97, 101] or mobile behaviors [83]. Through hands-on activities, they will understand the differences between anomaly detection, binary classifiers, and deep-learning embedding models and be able to create such models.
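To make the performance measures above concrete, here is a minimal sketch of how FAR, FRR, and the EER can be computed from similarity scores. The score lists are illustrative toy data, not real biometric output, and the threshold-scan approach is one simple way to estimate the EER.

```python
# Sketch: computing FAR, FRR, and the EER from similarity scores.
# The score lists below are toy data for illustration only.

def far_frr(genuine, impostor, threshold):
    """False Accept Rate and False Reject Rate at a given threshold.

    A user is accepted when their similarity score >= threshold.
    FAR: fraction of impostor attempts wrongly accepted.
    FRR: fraction of genuine attempts wrongly rejected.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Scan candidate thresholds; return the one where FAR and FRR are closest."""
    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates,
               key=lambda t: abs(far_frr(genuine, impostor, t)[0]
                                 - far_frr(genuine, impostor, t)[1]))
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2

genuine = [0.9, 0.8, 0.75, 0.7, 0.6]   # scores from the true user
impostor = [0.5, 0.4, 0.65, 0.3, 0.2]  # scores from other users

threshold, eer = equal_error_rate(genuine, impostor)
print(f"EER ~ {eer:.2f} at threshold {threshold}")  # EER ~ 0.20 at threshold 0.65
```

Sweeping the threshold over all observed scores and plotting (FAR, 1 - FRR) pairs is exactly how an ROC curve is traced; the EER is the operating point where the two error rates coincide.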
As one of the key requirements of trustworthy AI, it is crucial to ensure that an AI system is ethically fair toward different groups of people. An AI system can exhibit different types of bias. (1) It can inherit bias from its training data labels, e.g., an AI system will predict that all female applicants should not be hired if it learns from biased hiring decisions in which no female applicants (even qualified ones) were hired historically. (2) It can also favor one demographic group over another by generating more accurate or more positive predictions for data from that group. E.g., it was found in 2020 that face recognition software from large companies including Amazon, Microsoft, and IBM was significantly less accurate (by 20-30%) on darker female faces than on lighter male faces [70]. Another example is the COMPAS analysis [20], in which a machine learning system had similar accuracy across black and white groups in predicting whether a defendant would re-offend within two years. However, the system had an approximately 20% higher false positive rate and an approximately 20% lower false negative rate for black defendants; i.e., more black defendants were wrongly predicted as high risk when they did not re-offend, and more white defendants were wrongly predicted as low risk when they did re-offend. Co-PI Yu has extensive research experience in AI fairness [25, 108]. His ongoing research (funded by his NSF CRII award) on algorithmic fairness in regression problems and on estimating relative human bias with machine learning models will provide cutting-edge research experience for the REU students.
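The error-rate disparity described above can be measured directly. Below is a minimal sketch, in the spirit of the COMPAS analysis, that computes per-group false positive and false negative rates; the records are toy data, not the actual COMPAS dataset.

```python
# Sketch: per-group false positive / false negative rates.
# Toy data for illustration; 1 = (predicted or actual) re-offense.

def group_error_rates(records):
    """records: list of (group, y_true, y_pred) tuples."""
    rates = {}
    for g in {grp for grp, _, _ in records}:
        rows = [(y, p) for grp, y, p in records if grp == g]
        fp = sum(1 for y, p in rows if y == 0 and p == 1)   # wrongly flagged high risk
        fn = sum(1 for y, p in rows if y == 1 and p == 0)   # wrongly cleared as low risk
        negatives = sum(1 for y, _ in rows if y == 0)
        positives = sum(1 for y, _ in rows if y == 1)
        rates[g] = {"FPR": fp / negatives if negatives else 0.0,
                    "FNR": fn / positives if positives else 0.0}
    return rates

records = [
    ("A", 0, 1), ("A", 0, 0), ("A", 1, 1), ("A", 1, 1),
    ("B", 0, 0), ("B", 0, 0), ("B", 1, 0), ("B", 1, 1),
]
print(group_error_rates(records))
# Group A has a higher FPR, group B a higher FNR: an equalized-odds gap
# even if overall accuracy were identical across the groups.
```

Equal accuracy across groups does not imply equal error rates: the same number of mistakes can be distributed very differently between false positives and false negatives, which is precisely the disparity the COMPAS analysis surfaced.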
From tracking illegal wildlife poaching activities [104] to detecting peace-seeking, and hostility diffusing social web discussions amid warlike situations between nuclear adversaries [56, 74] – AI for social good has diverse applications. The projects in the AI for social good theme will have two key thrusts: (1) answering public policy questions using data; and (2) evaluating equitability and fairness in AI systems. Most projects will focus on natural language processing with a few projects also considering multimodal data (e.g., text and images, or text and videos). Potential projects will draw from extensive data science for social good research led by Mentor KhudaBukhsh [26, 31–33, 52, 53, 56, 57, 75, 79, 107] and will emphasize SDG goals 5 (gender equality), 13 (climate action), and 16 (peace, justice, and strong institutions). Through these projects, students will learn (1) how to navigate large data sets; (2) how to formulate a meaningful research question that can be answered from a large dataset; (3) how to design large language model prompts to mine stance; (4) how to design meaningful contrastive studies to investigate gender and social bias (Sample Project 2); (5) how to test the effectiveness of their experimental design; and (6) how to interpret the results and test robustness of the findings. Students who demonstrate strong progress will be encouraged to extend their projects to multimodal data (thrust I) and diverse data sources (thrust II). Projects with substantial novel contributions will be encouraged to be submitted as conference papers.
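As a flavor of item (4) above, a contrastive study holds everything in a prompt fixed except one gendered term and compares a model's responses on the two variants. The sketch below only generates such minimal pairs; the template, names, and roles are hypothetical examples, not the mentors' actual study design, and scoring the pairs with a model under test is left out.

```python
# Sketch: generating minimal contrastive prompt pairs for probing
# gender bias. Template and roles are illustrative assumptions.

TEMPLATE = "{name} is applying for a job as a {role}. Should they be interviewed?"

PAIRS = [("He", "She")]                      # the only term that varies
ROLES = ["software engineer", "nurse"]       # contexts held fixed per pair

def contrastive_prompts(template, pairs, roles):
    """Yield (variant_a, variant_b) prompts that differ only in the gendered term."""
    for a, b in pairs:
        for role in roles:
            yield (template.format(name=a, role=role),
                   template.format(name=b, role=role))

for pa, pb in contrastive_prompts(TEMPLATE, PAIRS, ROLES):
    print(pa, "<->", pb)
```

Because each pair is identical except for the swapped term, any systematic difference in the model's scores or generations across the pair can be attributed to that term rather than to the surrounding context.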
While a strong foundation in AI/ML is essential for students, it is equally crucial to equip them with the knowledge and skills to critically evaluate, stress test, and secure the AI/ML pipeline. This topic is crucial due to the increasing complexity of threats faced by ML systems. E.g., compromised diagnostic models in healthcare have led to misdiagnoses [88], poisoned data has undermined fraud detection in financial services [96], and a significant risk in the software supply chain has allowed malicious code and Trojans to slide into legitimate applications [28, 67]. Students should know not only how to build models but also how to break them, identify vulnerabilities, and assess the integrity of the entire process. REU students will conduct hands-on research projects that expose them to the various techniques for identifying and mitigating the most prominent potential weaknesses in the pipeline.
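To illustrate the poisoned-data threat mentioned above, here is a minimal sketch of a label-flipping attack against a toy 1-D nearest-centroid classifier. The "fraud detection" data is entirely illustrative; the point is only that corrupting a few training labels can silently degrade a model.

```python
# Sketch: label-flipping data poisoning against a toy classifier.
# All data below is illustrative, not from a real fraud-detection system.

def train_centroids(data):
    """data: list of (x, label). Return the per-class mean of x."""
    sums, counts = {}, {}
    for x, y in data:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

clean = [(0.0, "legit"), (1.0, "legit"), (9.0, "fraud"), (10.0, "fraud")]

# Attacker flips the labels of the fraud examples in the training set.
poisoned = [(x, "legit") if y == "fraud" else (x, y) for x, y in clean]

held_out = [(0.5, "legit"), (9.5, "fraud")]
print("clean   :", accuracy(train_centroids(clean), held_out))     # 1.0
print("poisoned:", accuracy(train_centroids(poisoned), held_out))  # 0.5
```

After poisoning, the training set contains no "fraud" labels at all, so the model can never flag fraud; real attacks are subtler, flipping only a fraction of labels, but the failure mode is the same.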
Required Application Materials
- Undergraduate transcripts. Unofficial copies can be accepted.
- A personal statement that justifies why you are a good fit for this REU Site program.
- Two reference letters, to be submitted directly by the letter writers to the ETAP website.
- An indication of your preferred research project and whether you are willing to work on a project other than your preferred one.
Application Materials must be submitted via NSF’s ETAP website.
Admission Process and Criteria
We anticipate that most successful applicants will be rising juniors and seniors majoring in a computing or engineering discipline, but we also plan to accept truly exceptional rising sophomores. Selection will be based on undergraduate transcripts, a personal statement, and two reference letters. We consider each applicant's academic background (a minimum GPA of 3.0/4.0 in a related major), demographic information, project selections, and personal statement. Successful applicants should have at least one year of coursework in some combination of calculus, linear algebra, and probability and statistics that the faculty mentors deem adequate for their research projects, as well as programming experience in Java, C++, or Python. Knowledge of data structures, algorithms, and systems is preferred but not required. We may consult with mentors or reference letter writers to further evaluate an applicant's qualifications. All offers will be subject to confirmation of NSF's citizenship eligibility requirements.
Summary of Benefits
Admitted REU students will receive a stipend of $700/week for 10 weeks ($7,000 total) and a meal allowance of $100/week for 10 weeks ($1,000 total). The REU Site will also provide free housing and reimburse each student up to $500 for a round trip between their home and RIT. Other benefits include a mentored research experience, weekly research group meetings, a variety of professional development activities, research seminars by invited AI experts, and attendance and presentation at RIT's 34th Undergraduate Research Symposium on July 31, 2025.