llama.cpp
Category: Models
License: MIT
A fast, portable inference stack for running open-weight language models across local machines, servers, and edge devices.
Tags: inference, local-ai, open-weights