1. You selected only 10 projects per field from a database of 1,689, prioritizing “longer and more detailed submissions for richness of information.” This introduces severe selection bias toward projects that are already more articulate about relevance. How can you claim to study “relevance expressions in vivo” when your sampling explicitly excludes shorter, potentially less relevance-articulate projects? This undermines any generalizable claim about how SSH fields actually express relevance.
2. Table 1 presents methodological keywords (e.g., History: “Context, Event, Description”; Psychology: “Effect, Control group, Experiment”) as evidence that your field selection aligns with the nomothetic/idiographic distinction. But these terms were derived from the same project texts you later analyze for relevance expressions. This creates circular reasoning: you use the texts to confirm the distinction, then use the distinction to explain patterns in the texts. What independent validation do you have for these disciplinary alignments?
3. Table 2 introduces a new typology (soft vs. hard relevance) that appears nowhere in your methods section. You abandon the nomothetic/idiographic distinction exactly when the results become ambiguous (Political Science and Linguistics occupying “grey zones”). This reads as ad hoc theory-building to rescue an overextended distinction. On what basis were these five dimensions derived, and why weren’t they pre-registered or pilot-tested?
4. Forty projects across four diverse fields (with sub-field variations within each) are insufficient to support claims about disciplinary “tendencies” or to populate a 2×5 typology. Many cells in Table 2 contain only 2–3 project examples. The keyword clusters in your own Table 1 come from ~12,000 words per field, or roughly 1,200 words per project. Is it methodologically sound to infer disciplinary “elective affinities” from such thin material, especially when proposal summaries are known to be rhetorically strategic rather than epistemically representative?