1. Organization: JD pointed out that the discussion and the conclusion should be separate sections. The discussion brings back the literature and situates the findings of the current study within the broader body of research. The conclusion summarizes the answers to the research questions and also answers the "so what" question.
2. Theoretical issue: the construct needs to be defined on both theoretical (literature-based) and practical grounds. Think in terms of both the theoretical conceptualization and the intended use of the construct. It may make more sense to justify the choice of tasks based on the intended use of the assessment and its implications.
3. Methodological issue:
(1) On using multitrait-multimethod (MTMM) confirmatory factor analysis: JD commented that in the language testing literature, method almost never shows up as a separate effect in MTMM analyses, except in
Bachman, L. F., & Palmer, A. S. (1982). The Construct Validation of Some Components of Communicative Proficiency. TESOL Quarterly, 16(4), 449-465.
Siwon's research also demonstrated a clear method effect. However, one could question whether that method effect is due to the different rating conditions (spontaneous vs. post hoc) used by the judges rather than to the difference between task types.
(2) John cautioned about the interpretation of the significant result. He suggested also reporting effect size, because statistical significance does not equal meaningfulness.
(3) It was also recommended that vocabulary diversity be calculated using the D parameter. I made a simple tutorial and published it online: http://ourmedia.org/node/366283.
(4) It was also recommended that Siwon derive participants' English proficiency levels from the complexity measures of their oral production.
(5) It would be nice to include narration as a different genre in the speaking tasks.
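To make point (2) concrete, here is a minimal sketch of one common effect size, Cohen's d with a pooled standard deviation. This is only an illustration: the defense notes do not say which effect size was meant, the function name is hypothetical, and the scores are made up.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups.

    Uses the pooled sample standard deviation as the denominator.
    Assumption: Cohen's d is chosen here only as an example of an
    effect size; other measures may suit the design better.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Made-up scores for two task types (not data from the dissertation).
task_one = [2, 4, 6]
task_two = [1, 3, 5]
print(cohens_d(task_one, task_two))
```

Reporting a value like this next to the p-value shows how large the difference is in standard deviation units, which a significance test alone does not.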
It's amazing how much I learned from Siwon's defense!
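As a footnote to point (3), the D parameter can be sketched in a few lines. D is the single parameter of the curve TTR = (D/N) * (sqrt(1 + 2N/D) - 1), where TTR is the type-token ratio of a sample of N tokens. The code below assumes the commonly described procedure (random samples of 35 to 50 tokens, 100 trials per sample size) and swaps in a plain grid search for a proper curve-fitting routine. All function names are hypothetical, and this is not necessarily the method used in the tutorial linked above.

```python
import math
import random


def model_ttr(n_tokens, d):
    """Type-token ratio predicted by the D model for a sample of n_tokens."""
    return (d / n_tokens) * (math.sqrt(1 + 2 * n_tokens / d) - 1)


def mean_sampled_ttr(tokens, n_tokens, trials, rng):
    """Average TTR over random samples drawn without replacement."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(tokens, n_tokens)
        total += len(set(sample)) / n_tokens
    return total / trials


def fit_d(observed):
    """Find the D (0.1 to 200, step 0.1) that minimizes squared error.

    `observed` maps sample size N to the mean TTR measured at that size.
    Assumption: a coarse grid search stands in for real least-squares fitting.
    """
    best_d, best_err = None, float("inf")
    for step in range(1, 2001):
        d = step * 0.1
        err = sum((model_ttr(n, d) - ttr) ** 2 for n, ttr in observed.items())
        if err < best_err:
            best_d, best_err = d, err
    return best_d


def estimate_d(tokens, trials=100, seed=0):
    """Estimate D for a list of word tokens (needs at least 50 tokens)."""
    if len(tokens) < 50:
        raise ValueError("need at least 50 tokens to sample sizes 35-50")
    rng = random.Random(seed)
    observed = {
        n: mean_sampled_ttr(tokens, n, trials, rng) for n in range(35, 51)
    }
    return fit_d(observed)
```

To try it, pass a list of lowercased word tokens from a transcript to `estimate_d`. A higher D indicates more diverse vocabulary. Because the samples are random, estimates vary slightly with the seed, so averaging a few runs is a reasonable precaution.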