Quickstart with vLLM
Introduction
In this tutorial we will learn how to use CodeTransEngine to translate source code. For this purpose, we will use a C++ example from the CodeNet dataset and translate it to Python using the ToCT algorithm.
Prerequisites
We assume you have configured the CodeTransEngine and vLLM server. We will use the examples/quickstart.example.yaml
configuration located at the root folder of the repository, please refer to the Configuration guide and set-up Python Executor Container.
Input Code
This is the C++ code that we will translate to Python.
#include <bits/stdc++.h>using namespace std;
int main (){ int L; cin>>L; if(L<1200){ cout<<"ABC"<<endl; } else if(L<2800){ cout<<"ARC"<<endl; } else{ cout<<"AGC"<<endl; }
}
The code is accompanied by fuzzy tests, which are pairs of input-output examples that help verify the correctness of the translation. This is possible because the programs in CodeNet receive input from the standard input and print the output to the standard output.
Fuzzy tests
stdin_input | expected_output |
---|---|
1199 | ABC |
1200 | ARC |
4208 | AGC |
Step 1: Launch vLLM
The first step is to launch the vLLM server in a terminal with the model you will use for translation. In this tutorial we will use the ise-uiuc/Magicoder-S-DS-6.7B model from HuggingFace. We need to launch the vLLM instance on the same IP address and port we configure in the Engine configuration.
vllm serve ise-uiuc/Magicoder-S-DS-6.7B --dtype auto --api-key token --port 8000 --host localhost
Depending on your GPU, you may want to set --max-model-len
to a lower value (e.g., --max-model-len 49024
) value to avoid running out of memory or try a smaller model.
Step 2: Launch CodeTransEngine
The next step is to launch the CodeTransEngine. We will use the quickstart.example.yaml
configuration file.
go run intertrans.go runserver examples/quickstart.example.yaml
Step 3: Build the request
Now that both the vLLM server and the CodeTransEngine are running, we can translate the code. We can create a new python script and start creating the translation request that will be sent to CodeTransEngine.
First we will import the protobuf objects necessary to create the request and utillity functions from the InterTrans Python client.
import intertrans.protos_pb2 as ptpbfrom intertrans.utils import submit_request
Next, we will use the code and fuzzy tests from the previous sections to create the translation request. CodeTransEngine allows either fuzzy test or unit tests for verification (but not both at this time, as it is a feature under development). We will use the fuzzy tests for this example.
Input code
We create a variable to hold the input code.
input_code = """#include <bits/stdc++.h>using namespace std;
int main (){ int L; cin>>L; if(L<1200){ cout<<"ABC"<<endl; } else if(L<2800){ cout<<"ARC"<<endl; } else{ cout<<"AGC"<<endl; }
}"""
Then, we create the translation request.
Request creation
batch_request = ptpb.BatchTranslationRequest()
request = ptpb.TranslationRequest()request.id = "1" # Unique identifier for the requestrequest.seed_language = "C++"request.target_language = "Python"request.seed_code = input_coderequest.model_name = "ise-uiuc/Magicoder-S-DS-6.7B"request.prompt_template_name = "prompt_codenet"request.regex_template_name = "temperature"
As you can see from the code above, we first create a BatchTranslationRequest
object and then create a TranslationRequest
object. CodeTransEngine is designed to maximize the throughput when translating multiple requests in batch. In this example we use a single request, but you can add more requests to the batch request. We set the seed language to C++ and the target language to Python. This variable is called seed as it is the lenguage of the original code that will be translated. We also set the model name to the Magicoder model we launched in the vLLM server. We also need to specify the prompt template used to build the prompts during the ToCT algorithm. Lastly, the regex_template_name
is the name of the regex used to match and extract source code from the inference output of the model.
Intermediate Languages
As explained in the Introduction, ToCT uses intermediate languages to improve the translation quality. We can specify the intermediate languages used in the translation by adding them to the used_languages
list in the request object. The more languages used the higher chance of finding a translation, however, it also increases the computational cost. We recommend starting with the following language list and tweak it according to your needs and experiments on what works best for your use case. See the Performance Tuning guide for more information.
For this tutorial, we will use the following languages as intermediates:
request.used_languages.append("Go")request.used_languages.append("Java")request.used_languages.append("Python")request.used_languages.append("C++")request.used_languages.append("JavaScript")request.used_languages.append("Rust")
Adding fuzzy tests
Lastly, we add the fuzzy tests to the request and we finish creating the request.
fuzzytest1 = ptpb.FuzzyTestCase()fuzzytest1.stdin_input = "1199"fuzzytest1.expected_output = "ABC"
fuzzytest2 = ptpb.FuzzyTestCase()fuzzytest2.stdin_input = "1200"fuzzytest2.expected_output = "ARC"
fuzzytest3 = ptpb.FuzzyTestCase()fuzzytest3.stdin_input = "4208"fuzzytest3.expected_output = "AGC"
request.test_suite.fuzzy_suite.append(fuzzytest1)request.test_suite.fuzzy_suite.append(fuzzytest2)request.test_suite.fuzzy_suite.append(fuzzytest3)
batch_request.translation_requests.append(request)
Step 4: Submit the request
Now that we have built the request, we can submit it to the CodeTransEngine. The submit_request
function will send the request to the server and return the results. This function is just a wrapper around gRPC calls to the server.
request_results = submit_request(batch_request, "localhost:50051")
Step 5: Check the results
The results
variable now holds an instance of BatchTranslationResponse that holds the TranslationResponse
of every translation request.
The requests are indexed according to the order added into the BatchTranslationRequest
. Therefore, we can access the results for our translation as follows:
translation_result = request_results.translation_responses[0]
This will give us access to the data structure for the ToCT which includes the prompts, intermediate translations, execution times, final translation, verification results, and more.
We can traverse these results by accessing the translation_results.paths
list. Each element in the list is a ResponseTranslationPath object that holds the data for a specific path in the ToCT. This is very powerful as it allows us to extract information about the translation process to use in further analysis, such as running evaluation metrics on the verification results (e.g. Computational Accuracy, CodeBLEU) or for example use the results from the verification as input for a next iteration of translations (e.g. Iterative translation with compiler feedback).
To keep this tutorial simple, we will utilize the get_translation
function that returns the translated code if the translation was found, or None
if the translation was not found. A translation is found if all the verification tests pass.
python_translation = get_translation(translation_result)
Finally, we can print the translation to the console.
print(python_translation)
Translated code
The translated code will look like this:
L = int(input())if L < 1200: print("ABC")elif L < 2800: print("ARC")else: print("AGC")
Conclusion
As you can see, most of the work is on creating the requests, while InterTrans saves the effort of running inference, executing the code and validating it. If you are translating a dataset, you can create a reusable function that creates the batch request and submits it to the server. This way you can easily translate multiple code snippets in a single run.
You can get the full code for this tutorial here: