In batch Q-learning you only have historical data, with no possibility of acquiring new data by following a given policy. In growing batch Q-learning, by contrast, the algorithm is almost the same, except that in some iterations you use intermediate policies to acquire more data, thus growing the batch with new transitions (which incorporate exploration).
So, if you only have historical data, it is not possible to grow the batch with new data. That is, in your case it is not possible to implement growing batch Q-learning.
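To make the difference concrete, here is a minimal tabular sketch in Python. It is only an illustration under several assumptions: the function names (`batch_q_learning`, `collect`, `growing_batch_q_learning`), the toy 4-state chain environment `chain_step`, and the hyperparameters are all made up for this example, and a plain tabular Q-update stands in for whatever function approximator you actually use. The point to notice is that the growing variant needs a `step` function, i.e. access to the environment, which is exactly what you lack when you only have historical data.

```python
import random

def batch_q_learning(batch, n_states, n_actions, gamma=0.9, alpha=0.5,
                     sweeps=50, q=None):
    """Sweep repeatedly over a FIXED batch of (s, a, r, s_next, done) tuples."""
    if q is None:
        q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s_next, done in batch:
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q

def collect(step, q, n_actions, n_steps, epsilon, rng, start_state=0):
    """Gather new transitions with an epsilon-greedy policy derived from q.

    Requires `step`, i.e. the ability to interact with the environment.
    """
    transitions = []
    s = start_state
    for _ in range(n_steps):
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)          # explore
        else:
            best = max(q[s])                      # exploit, ties broken at random
            a = rng.choice([i for i in range(n_actions) if q[s][i] == best])
        s_next, r, done = step(s, a)
        transitions.append((s, a, r, s_next, done))
        s = start_state if done else s_next
    return transitions

def growing_batch_q_learning(batch, step, n_states, n_actions, rounds=5,
                             n_steps=50, epsilon=0.3, seed=0):
    """Alternate between batch learning and growing the batch with new data."""
    rng = random.Random(seed)
    batch = list(batch)                           # do not mutate the caller's data
    q = None
    for _ in range(rounds):
        q = batch_q_learning(batch, n_states, n_actions, q=q)
        batch.extend(collect(step, q, n_actions, n_steps, epsilon, rng))
    return batch_q_learning(batch, n_states, n_actions, q=q), batch

def chain_step(s, a):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.

    Reaching state 3 gives reward 1 and ends the episode.
    """
    s_next = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    done = s_next == 3
    return s_next, (1.0 if done else 0.0), done

# Historical data that never reaches the rewarding state.
historical = [(0, 0, 0.0, 0, False), (0, 1, 0.0, 1, False), (1, 0, 0.0, 0, False)]

# Pure batch: learns only from what is already in the data.
q_batch = batch_q_learning(historical, n_states=4, n_actions=2)

# Growing batch: only possible because chain_step can be called.
q_grown, grown = growing_batch_q_learning(historical, chain_step, 4, 2)
```

With only `historical` available, the second call cannot be made at all, because there is no `chain_step` to query; everything you can do is contained in `batch_q_learning`.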
You can read a detailed explanation in chapter 2 of the book: Wiering, Marco, and Martijn van Otterlo, eds. Reinforcement Learning: State-of-the-Art. Springer, 2012.