Shuming Ma — Pith Author Registry

Identifiers

name variant Shuming Ma 0.60 · backfill

Papers (30)

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits · cs.CL · 2024 · author #1
BitNet: Scaling 1-bit Transformers for Large Language Models · cs.CL · 2023 · author #2
Retentive Network: A Successor to Transformer for Large Language Models · cs.CL · 2023 · author #4
Kosmos-2: Grounding Multimodal Large Language Models to the World · cs.CL · 2023 · author #6
Language Is Not All You Need: Aligning Perception with Language Models · cs.CL · 2023 · author #6
Unsupervised Machine Commenting with Neural Variational Topic Model · cs.CL · 2018 · author #1
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts · cs.CL · 2018 · author #1
A Deep Reinforced Sequence-to-Set Model for Multi-Label Text Classification · cs.CL · 2018 · author #2
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification · cs.CL · 2018 · author #4
Identifying High-Quality Chinese News Comments Based on Multi-Target Text Matching Model · cs.CL · 2018 · author #2
SGM: Sequence Generation Model for Multi-label Classification · cs.CL · 2018 · author #4
Deconvolution-Based Global Decoding for Neural Machine Translation · cs.CL · 2018 · author #4
Bag-of-Words as Target for Neural Machine Translation · cs.CL · 2018 · author #1
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization · cs.CL · 2018 · author #1
Global Encoding for Abstractive Summarization · cs.CL · 2018 · author #3
Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network · cs.CL · 2018 · author #4
A Hierarchical End-to-End Model for Jointly Improving Text Summarization and Sentiment Classification · cs.CL · 2018 · author #1
Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation · cs.CL · 2018 · author #1
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation · cs.CL · 2018 · author #2
Complex Structure Leads to Overfitting: A Structure Regularization Decoding Method for Natural Language Processing · cs.LG · 2017 · author #3
Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data? · cs.CL · 2017 · author #3
Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method · cs.LG · 2017 · author #3
Label Embedding Network: Learning Label Representation for Soft Training of Deep Networks · cs.LG · 2017 · author #4
A Semantic Relevance Based Neural Network for Text Summarization and Text Simplification · cs.CL · 2017 · author #1
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting · cs.LG · 2017 · author #3
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization · cs.CL · 2017 · author #1
A Generic Online Parallel Learning Framework for Large Margin Models · cs.CL · 2017 · author #1
Lock-Free Parallel Perceptron for Graph-based Dependency Parsing · cs.CL · 2017 · author #2
A New Recurrent Neural CRF for Learning Non-linear Edge Features · cs.CL · 2016 · author #1
Towards Easier and Faster Sequence Labeling for Natural Language Processing: A Search-based Probabilistic Online Learning Framework (SAPO) · cs.LG · 2015 · author #2

Mentions

2402.17764 #1 · arxiv_oai · confidence 0.70 · Shuming Ma
2310.11453 #2 · arxiv_oai · confidence 0.70 · Shuming Ma
2302.14045 #6 · arxiv_oai · confidence 0.70 · Shuming Ma

Frequent Coauthors

Xu Sun 25 shared papers
Xuancheng Ren 9 shared papers
Junyang Lin 8 shared papers
Furu Wei 7 shared papers
Houfeng Wang 6 shared papers
Qi Su 6 shared papers
Li Dong 5 shared papers
Pengcheng Yang 5 shared papers
Shaohan Huang 5 shared papers
Yi Zhang 5 shared papers
Wei Li 4 shared papers
Lei Cui 3 shared papers
Wenhui Wang 3 shared papers
Wenjie Li 3 shared papers
Bingzhen Wei 2 shared papers
Hongyu Wang 2 shared papers
Jilong Xue 2 shared papers
Jingjing Xu 2 shared papers
Lingxiao Ma 2 shared papers
Ruiping Wang 2 shared papers