Federated Learning for Mobile Keyboard Prediction
We train a recurrent neural network language model using a distributed, on-device learning framework called federated learning for the purpose of next-word prediction in a virtual keyboard for smartphones. Server-based training using stochastic gradient descent is compared with training on client devices using the Federated Averaging algorithm. The federated algorithm, which enables training on a higher-quality dataset for this use case, is shown to achieve better prediction recall. This work demonstrates the feasibility and benefit of training language models on client devices without exporting sensitive user data to servers. The federated learning environment gives users greater control over the use of their data and simplifies the task of incorporating privacy by default with distributed training and aggregation across a population of client devices.
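The Federated Averaging step the abstract refers to can be illustrated with a minimal sketch. This is not the paper's implementation: it swaps the recurrent language model for a toy one-dimensional linear model, and the names `local_sgd`, `federated_averaging`, and `CLIENTS`, the learning rate, and the client sampling scheme are all assumptions made for illustration. Each round, a few clients run SGD locally starting from the current global weights, and the server averages the returned weights in proportion to each client's example count. Raw examples never leave the client list.

```python
import random


def local_sgd(weights, data, lr=0.1, epochs=1):
    """Run plain SGD on one client's data for the toy model y = w * x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error on one example
            w -= lr * err * x      # gradient of 0.5 * err**2 w.r.t. w
            b -= lr * err          # gradient of 0.5 * err**2 w.r.t. b
    return (w, b)


def federated_averaging(clients, rounds=200, clients_per_round=2, lr=0.1, seed=0):
    """Toy Federated Averaging: only model weights are exchanged, never data."""
    rng = random.Random(seed)
    global_w = (0.0, 0.0)
    for _ in range(rounds):
        # Hypothetical sampling scheme: pick a random subset of clients each round.
        selected = rng.sample(clients, clients_per_round)
        # Every selected client trains locally, starting from the global weights.
        updates = [(local_sgd(global_w, data, lr), len(data)) for data in selected]
        total = sum(n for _, n in updates)
        # Server step: average client weights, weighted by local example count.
        global_w = tuple(
            sum(wts[i] * n for wts, n in updates) / total for i in range(2)
        )
    return global_w


# Illustrative data (an assumption): three clients, each holding its own
# samples of y = 2x + 1 over a different range of x.
CLIENTS = [
    [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0)],
    [(x, 2 * x + 1) for x in (-0.5, 0.5)],
    [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0)],
]

w, b = federated_averaging(CLIENTS)
print(f"learned w={w:.3f}, b={b:.3f}")
```

Weighting by example count means clients with more local data pull the average harder. A production system would add pieces this sketch leaves out, such as secure aggregation and client eligibility checks.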
Forward citations
Cited by 22 Pith papers
-
When More Parameters Hurt: Foundation Model Priors Amplify Worst-Client Disparity Under Extreme Federated Heterogeneity
Foundation model priors amplify worst-client disparity under extreme federated heterogeneity, creating a fairness paradox where larger models perform worse for disadvantaged clients.
-
Unified Compression Algorithm for Distributed Nonconvex Optimization: Generalized to 1-Bit, Saturation, and Bounded Noise
A unified compression algorithm for distributed nonconvex optimization achieves O(1/sqrt(T)) convergence for locally-bounded compressors, matching centralized 1-bit methods, with an improved O(1/T^{2/3}) rate after on...
-
XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers
XFED is the first aggregation-agnostic non-collusive model poisoning attack that bypasses eight state-of-the-art defenses on six benchmark datasets without attacker coordination.
-
Distributed Online Convex Optimization with Compressed Communication: Optimal Regret and Applications
Optimal regret bounds O(δ^{-1/2}√T) for convex and O(δ^{-1} log T) for strongly convex losses are achieved in distributed online convex optimization under compressed communication.
-
Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning
SABLE shows that semantics-aware natural triggers enable effective backdoor attacks in federated learning against multiple aggregation rules while preserving benign accuracy.
-
Simulating Word Suggestion Usage in Mobile Typing to Guide Intelligent Text Entry Design
WSTypist is a new RL-based simulation model that reproduces human-like word suggestion strategies, individual differences, and adaptation to design changes in mobile text entry.
-
Tighter Performance Theory of FedExProx
New analysis framework yields tighter linear convergence for FedExProx on non-strongly convex quadratics and PL functions, proving outperformance over GD once communication costs are counted.
-
Factor Augmented High-Dimensional SGD
Proposes Factor-Augmented SGD that runs on streaming high-dimensional data and supplies the first convergence analysis explicitly accounting for latent-factor estimation error.
-
Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity
Rescaled ASGD recovers convergence to the true global objective by rescaling worker stepsizes proportional to computation times, matching the known time lower bound in the leading term under non-convex smoothness and ...
-
Analytically Characterized Optimal Power Control for Signal-Level-Integrated Sensing, Computing and Communication in Federated Learning
An optimal convex-reformulated power control algorithm is derived for signal-level integrated sensing, computing and communication in AirComp-based federated learning under a joint target detection constraint.
-
FedACT: Concurrent Federated Intelligence across Heterogeneous Data Sources
FedACT schedules devices across concurrent FL jobs via alignment scoring and fairness to reduce average job completion time by up to 8.3x and raise accuracy by up to 44.5% versus baselines.
-
DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training
DeepFedNAS delivers up to 1.21% higher accuracy and 61x faster architecture search for federated learning on heterogeneous IoT by replacing random supernet sampling with Pareto-optimal elite architectures and using a ...
-
Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers
PEP-FedPT achieves generalization and personalization in federated ViT prompt tuning via adaptive mixing of class-specific prompts weighted by global class prototypes and client priors, without per-client trainable pa...
-
Privacy Against Agnostic Inference Attacks in Vertical Federated Learning
Active party in VFL performs agnostic inference attacks via independent models on logistic regression and counters them with tunable distortion of passive-party parameters.
-
Adaptive Federated Optimization
Proposes federated adaptive optimizers (FedAdagrad, FedAdam, FedYogi) with convergence analysis for non-convex objectives under data heterogeneity and reports empirical gains over FedAvg.
-
FedEDAuth -- Federated Embedding Distribution Authentication for Counterfeit IC Detection
FedEDAuth filters malicious clients in federated learning for counterfeit IC detection by analyzing embedding distributions from a golden reference, achieving 100% detection of poisoned clients and 94.17% model accura...
-
HUOZIIME: An On-Device LLM-enhanced Input Method for Deep Personalization
HUOZIIME is an on-device LLM-powered input method with post-training on synthesized data and hierarchical memory that achieves efficient execution and memory-driven personalization.
-
REVERB-FL: Server-Side Adversarial and Reserve-Enhanced Federated Learning for Robust Audio Classification
REVERB-FL uses a server-side reserve set with retraining and adversarial training to reduce poisoning effects and speed convergence in federated audio classification under non-IID data.
-
LADSG: Label-Anonymized Distillation and Similar Gradient Substitution for Label Privacy in Vertical Federated Learning
LADSG is a unified defense framework that reduces success rates of passive, active, and direct label inference attacks in VFL by 30-60% via label anonymization, gradient substitution, and norm-based filtering.
-
BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning
BoBa uses data distribution inference and overlapping clustering with voting to detect backdoor attacks in non-IID federated learning, claiming attack success rates below 0.001.
-
Federated Learning by Utility-Constrained Stochastic Aggregation for Improving Rational Participation
FedUCA formalizes the server as an optimizer that uses utility-constrained stochastic aggregation to maximize client retention and global performance in heterogeneous federated learning.
-
Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey
The paper surveys split and aggregation learning for foundation models in 6G networks to improve efficiency, resource use, and data privacy in distributed AI.