Hello!
Thank you so much for developing and releasing this model to the public. As a native Arabic speaker, I highly appreciate your efforts in enriching our beautiful language.
I have the following question related to the training process:
As per my understanding, the first step is continued pretraining of Llama2 on Arabic data in a self-supervised manner.
My question is: how large is the dataset used in this step?
Thanks in advance.