|FA2: Fast, accurate autoscaling for serving deep learning inference with SLA guarantees|
K Razavi, M Luthra, B Koldehofe, M Mühlhäuser, L Wang
2022 IEEE 28th Real-Time and Embedded Technology and Applications Symposium …, 2022
|Operator as a service: Stateful serverless complex event processing|
M Luthra, S Hennig, K Razavi, L Wang, B Koldehofe
2020 IEEE International Conference on Big Data (Big Data), 1964-1973, 2020
|Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems|
M Salmani, S Ghafouri, A Sanaee, K Razavi, M Mühlhäuser, J Doyle, ...
Proceedings of the 3rd Workshop on Machine Learning and Systems, 78-86, 2023
|Distributed DNN serving in the network data plane|
K Razavi, G Karlos, V Nigade, M Mühlhäuser, L Wang
Proceedings of the 5th International Workshop on P4 in Europe, 67-70, 2022
|IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency|
S Ghafouri, K Razavi, M Salmani, A Sanaee, T Lorido-Botran, L Wang, ...
arXiv preprint arXiv:2308.12871, 2023