Stella Biderman

Affiliation: Booz Allen Hamilton, EleutherAI

Homepage: stellabiderman.com

Citations

Citations: 10,013

h-index: 30

i10-index: 36

Publications: 59

Title

Citations

Year

What language model to train if you have one million GPU hours?

Teven Le Scao, Thomas Wang, Daniel Hesslow, Lucile Saulnier
arXiv preprint arXiv:2210.15424

2022

Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection

Mohammad Mahmudul Alam, Edward Raff, Stella R Biderman, Tim Oates
International Conference on Artificial Intelligence and Statistics, 4042-4050

2024

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Travis Hoppe, Jason Phang, Horace He, Stella Biderman
arXiv: Computation and Language

108

2020

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab
arXiv: Computation and Language

2021

GPT-NeoX-20B: An open-source autoregressive language model

Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony
arXiv preprint arXiv:2204.06745

2022

VQGAN-CLIP: Open domain image generation and editing with natural language guidance

Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander
Smpte Journal, 88-105

2022

GPT-Neo: Large scale autoregressive language modeling with Mesh-Tensorflow

Sid Black, Leo Gao, Phil Wang, Connor Leahy

252

2021

BigBIO: A framework for data-centric biomedical natural language processing

Jason Fries, Leon Weber, Natasha Seelam, Gabriel Altay
Advances in Neural Information Processing Systems 35, 25792-25806

2022

Data governance in the age of large-scale data-driven language technology

Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers
Smpte Journal, 2206-2222

2022

EleutherAI: Going Beyond "Open Science" to "Science in the Open"

Jason Phang, Herbie Bradley, Leo Gao, Louis Castricato
arXiv preprint arXiv:2210.06413

2022

Documenting geographically and contextually diverse data sources: The BigScience catalogue of language data and resources

Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen
arXiv preprint arXiv:2201.10066

2022

Datasheet for the Pile

Stella Biderman, Kieran Bicheno, Leo Gao
arXiv preprint arXiv:2201.07311

2022

Cut the CARP: Fishing for zero-shot story evaluation

Shahbuland Matiana, JR Smith, Ryan Teehan, Louis Castricato
arXiv preprint arXiv:2110.03111

2021

OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization

Gustaf Ahdritz, Nazim Bouatta, Christina Floristean, Sachin Kadyan
bioRxiv 2022-11

2022

Fooling MOSS detection with pretrained language models

Stella Biderman, Edward Raff
Smpte Journal, 2933-2943

2022

Llemma: An open language model for mathematics

Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos
arXiv preprint arXiv:2310.10631

2023

Crosslingual Generalization through Multitask Finetuning

Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts
arXiv preprint arXiv:2211.01786

409

2022

You reap what you sow: On the challenges of bias evaluation under multilingual settings

Zeerak Talat, Aurélie Névéol, Stella Biderman, Miruna Clinciu
Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models, 26-41

2022

BLOOM: A 176B-parameter open-access multilingual language model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick

1,242

2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Zheng-Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji
arXiv preprint arXiv:2212.09535

2022

Natural Language Processing

Artificial Intelligence

Language Modeling

Deep Learning

Co-authors: Edward Raff, Quentin Anthony, Hailey Schoelkopf, Lintang Sutawika
