ML Engineer / Research Scientist

Minuteman Group, Inc.

Yesterday
Secret
Senior Level Career (10+ yrs experience)
IT - Software
Lexington, MA (On/Off-Site)

Job Title: Software Developer (Hybrid)
Location: Lexington, MA
Job Type: W2 Contract

Background/Need:
The Group focuses on objective, technology-based, human-centered solutions to measure, model and modify cognitive and physiological functions for performance, enhancement, sustainment, or recovery. In keeping with its human-focused mission, work in the Group includes Human Language Technology (HLT), focusing on processing both speech and text for a variety of HLT applications, including foreign language training, technology evaluation, speaker and language recognition, word and topic spotting, speech coding, speech and audio signal enhancement, machine translation, and information extraction. Work also includes areas focused on Generative AI, leveraging the latest in large language models and transformer-based capabilities.
The Group is in need of a Software Developer -- Specialization in Multilingual Research Data Development to:
1. Collect and analyze text, audio and physiological data.
2. Design and create multilingual databases.
3. Author study protocols and successfully apply to Human Subject Review (HSR) boards for approval.
4. Design and implement audio QA/QC tools, procedures, and best practices.
5. Maintain and manage an audio facility including computer systems maintenance; hardware support; interaction with vendors.
6. Create and maintain Data Security Plans (DSP) and Loan agreements.
7. Implement Natural Language Processing (NLP) and Machine Learning (ML) tools and techniques for HLT evaluation and applications.


Responsibilities:
1. HLT Data Collection: Implement Natural Language Processing (NLP) and Machine Learning (ML) tools and techniques to create and enhance data for Human Language Technology (HLT) evaluation and applications. Use the DOMINO workflow system for creating interactive foreign language training and testing materials.
2. Human Language Technology Evaluation: Ability to identify and apply benchmarks to evaluate AI model performance. Ability to create custom multi-lingual datasets for the evaluation of machine translation (MT) and automatic speech recognition (ASR) systems.
3. Generative AI: Expertise in applying Large Language Models (LLMs) and generative AI to operational systems. Skills include programming with llama-cpp, Mistral, GPT4All, ChatGPT, Orca, and transformer-based capabilities generally.
4. Audio/Speech QA/QC: Define, create and implement audio applications for measurement and enhancement of audio and speech recording quality; assess speech corpora integrity; coordinate with and provide guidance to subcontractors providing speech corpora.
5. Advanced audio data analysis: Ability to design, implement and confirm the performance of an audio data collection method for the speech intelligibility evaluation of wearable acoustic sensors.
6. Human Subjects Protocols: Design, author and implement study protocols for the collection of multilingual speech and multi-modal databases. Submit to and maintain protocols with HSR boards and US DoD Human Research Protection agencies.
7. Manage Data Collection Equipment and Facility: Maintain and manage the Group sound room facility; specifying equipment needs, coordinating efforts across multiple Groups, creating calibrated acoustic noise simulations, implementing study protocols for collecting multi-modal data from human subjects; author and implement the procedures necessary to provide and preserve the capability to perform in-field speech and acoustic noise data collections and speech communication.
8. Laboratory Facilities: Ability to work closely with the Facilities division to design and specify new laboratory spaces. Ability to interface with the technical team, understand how the laboratory spaces will need to be designed to address technical needs, and communicate the design specifications to the Facilities division.

Must Have:
1. HLT Research Experience: Experience with Java, Python, MATLAB, git, and Digital Audio Workstation (DAW) tools such as Adobe Audition, Audacity, SoX (Sound eXchange), etc.; must include experience using machine learning techniques and natural language processing tools to create HLT data sets. Familiarity with foreign language corpus development is required for this work. Requires experience designing crowdsourcing jobs for text annotation; experience with JSON and SQL Databases. Experience directing subject matter experts to create interactive foreign language training and testing materials.
2. Human Subjects Experience: Authoring Study Protocols and successfully submitting them for approval to Human Subject Review Boards for the purpose of multi-sensor data collections and language-learning systems performance. Demonstrated ability to train new personnel in implementing human subjects data collection protocols is required.
3. Sound Room Management: Specify and Maintain equipment. Data Collection Hardware: MacOS and MS Windows platforms, professional audio interfaces, loudspeaker playback systems, audio microphone and multi-modal sensors (heart rate, skin conductance, etc.) data collection systems. National Instruments data collection systems; Portable audio recording systems and Sound Pressure Level (SPL) meters. Demonstrated ability to author and maintain Data Security Plans and Loan Agreements for off-site equipment. Solid understanding of audio equipment usage is required.
4. Independence and Reliability: Demonstrated ability to work independently to complete complex projects on a tight schedule. Requires strong communication skills for interacting with various groups, human subjects, and subcontractors. Demonstrated ability to lead and coordinate teams to produce deliverables on tight deadlines.
5. HLT/Machine Learning: Demonstrated experience implementing Machine Learning and Human Language Technology / Natural Language Processing tools and services.
6. Software dev-ops: Demonstrated ability to work in an agile development cycle, including issues, projects, pull request review, UI and unit testing, Jenkins builds, and Artifactory storage and deployment.


Nice to Have:
1. Experience with digital signal processing.
2. Experience in Digital Speech Communication Test and Evaluation.
3. Experience with JSON and SQL Databases.
4. Experience in extracting and analyzing data from social media platforms.



Education & Experience: 10+ years of relevant work experience.
Work Authorization: US Citizenship is required due to the nature of the work.
Clearance: Active Secret Clearance.



"We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disabil