I started about three weeks ago with my (heterogeneous) third-graders. I've found that some students like it more than others (though in general they all like it), but this doesn't seem to be determined by ability level. The two students who love it the most are on opposite ends of the spectrum.
I do agree with your plan to go heterogeneous, for the exact reasons that SReevesTX cites. Pairing students with partners of a different ability level has worked great for me. I don't put my highest with my lowest, but rather my highest with my exact middle, my second highest with the student just below that middle, and so on. (Obviously ranking them this way is an oversimplification, but it works fine for determining partners.)
Finally, re: SReevesTX, I started out with 1s and 2s in a similar way (1s high and 2s low), but don't actually use it much. I use Switch instead. If I give a difficult topic, the 1s are more likely to start anyway, and if the 2 starts instead, all the better! Switch makes sure that no matter the difficulty of your discussion topic, all the students are talking.