This repo releasing the code and benchmark datasets for paper "Token Alignment via Character Matching for Subword Completion" in ACL Findings 2024. In our paper, we noticed LLMs usually generate ...
Abstract: Generative artificial intelligence (GenAI), specifically, Large Language Models (LLMs), have shown tremendous potential in automating several tasks and improving human productivity. Recent ...
Thanks to AWQ, TinyChat can deliver more efficient responses with LLM/VLM chatbots through 4-bit inference. TinyChat on RTX 4090 (3.4x faster than FP16): TinyChat on Jetson Orin (3.2x faster than FP16 ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results