The impact of evaluation scenario development on the quantitative performance of speech translation systems prescribed by the SCORE framework

Authors: Brian A. Weiss, Craig Schlenoff

DOI: 10.1145/1865909.1865957

Abstract: The Defense Advanced Research Projects Agency's (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program is a focused advanced technology research and development program. The intent of this program is to demonstrate the capability to rapidly develop and implement free-form, two-way, speech-to-speech spoken language translation systems that allow speakers of different languages to communicate with each other in real-world tactical situations without the need for an interpreter. The National Institute of Standards and Technology (NIST), with support from the Mitre Corporation and Appen Pty Limited, has been funded by DARPA to evaluate the TRANSTAC technologies since 2006. The NIST-led Independent Evaluation Team (IET) has numerous responsibilities in this ongoing effort, including collecting and processing training data, designing and implementing performance evaluations, and analyzing test data. In order to design and execute fair and relevant evaluations, the NIST IET employed the System, Component, and Operationally-Relevant Evaluation (SCORE) framework. The SCORE framework is a unified set of criteria and tools built around the premise that, to gain an understanding of how a technology would perform in its intended environment, it must be evaluated at both the component and system levels and further tested in operationally-relevant environments, while capturing both quantitative and qualitative data. Since a key evaluation goal is to capture quantitative performance data on the TRANSTAC technologies, the IET developed and implemented two forms of SCORE-inspired live evaluation scenarios. These two forms of scenarios have unique impacts on system performance. This paper presents the scenario development methodology, as well as the scenarios' influence on quantitative performance.
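
The abstract's emphasis on capturing quantitative performance data can be made concrete with a small example. The sketch below computes word error rate (WER), one standard quantitative metric for the speech-recognition stage of speech-to-speech translation systems. It is illustrative only: the choice of metric, the function name, and the sample sentences are assumptions for this sketch, not details taken from the paper or the TRANSTAC evaluation protocol.

```python
# Illustrative sketch (assumption, not from the paper): word error rate (WER),
# a common quantitative metric for the speech-recognition stage of
# speech-to-speech translation systems.
def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference,
    normalized by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming:
    # d[i][j] = edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Two substitutions against a five-word reference -> WER = 0.4.
print(wer("where is the nearest checkpoint",
          "where is nearest the checkpoint"))  # 0.4
```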
