Distribution of word counts in response fields

We focus on the word count of the response field because that is where the data of interest lies.

Response word counts vary widely. Most values are concentrated on the left of the distribution, at low word counts, with a long tail extending to the right. The distribution is therefore positively skewed (right-skewed).
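To check the direction of the skew numerically, a minimal Python sketch is shown here. It counts whitespace-separated words per response and computes a moment-based sample skewness. The `responses` list is made-up illustration data, not the survey data, and `word_count` and `sample_skewness` are hypothetical helper names. A positive result indicates a long right tail, and in that case the mean sits above the median.

```python
from statistics import mean, median


def word_count(text):
    """Count whitespace-separated tokens in a response."""
    return len(text.split())


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    m = mean(values)
    m2 = mean((v - m) ** 2 for v in values)
    m3 = mean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5


# Hypothetical responses: mostly short answers plus one very long one.
responses = [
    "si",
    "no aplica",
    "tres palabras aqui",
    "una respuesta corta",
    "ok",
    " ".join(["palabra"] * 120),
]

counts = [word_count(r) for r in responses]
skew = sample_skewness(counts)

print("word counts:", counts)
print("mean:", mean(counts), "median:", median(counts))
print("skewness:", skew)
```

Comparing the mean with the median is a quick sanity check that needs no formula: when a few very long responses pull the mean well above the median, the tail is on the right.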

Figure 11: Distribution of response word counts, zoomed in on counts between 0 and 100.

The distribution of response word counts for specific sectors with relatively large sample sizes is as follows:

Figure 12: Distribution of word count of responses in “VBG”, “Poblacion” and “Tecnico” sectors
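A per-sector comparison of this kind can be sketched as follows in Python. The sketch groups word counts by sector, keeps only sectors with at least `min_n` responses, and reports a few summary statistics. The records, the threshold and the function name `summarise_by_sector` are all assumptions made for illustration. Only the sector labels are borrowed from this report.

```python
from collections import defaultdict
from statistics import median


def summarise_by_sector(records, min_n):
    """Group response word counts by sector and keep sectors with >= min_n rows."""
    groups = defaultdict(list)
    for sector, response in records:
        groups[sector].append(len(response.split()))
    return {
        sector: {"n": len(counts), "median": median(counts), "max": max(counts)}
        for sector, counts in groups.items()
        if len(counts) >= min_n
    }


# Hypothetical (sector, response) pairs, not the survey data.
records = [
    ("VBG", "a b c"),
    ("VBG", "a b"),
    ("VBG", "a b c d e"),
    ("Tecnico", "a"),
    ("Tecnico", "a b c d"),
    ("Tecnico", "a b"),
    ("Salud", "a b c d e f"),
]

summary = summarise_by_sector(records, min_n=3)
for sector, stats in summary.items():
    print(sector, stats)
```

Filtering on sample size before comparing keeps sectors with only one or two responses from dominating the picture.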

Logistic regression

We use a logistic regression model to estimate how the average response word count is associated with each sector.

Term | Estimate | Std. error | Statistic | p-value
--- | --- | --- | --- | ---
(Intercept) | 12.534884 | 10.18453 | 1.23077715 | 2.186480e-01
Acceso_a_educación | 64.848678 | 12.83833 | 5.05117806 | 5.073289e-07
Agua, saneamiento e higiene | 24.361668 | 16.04753 | 1.51809478 | 1.292546e-01
Alojamiento Temporal | 33.840116 | 19.55721 | 1.73031411 | 8.383191e-02
Apoyo Educacional a Comunidades Receptoras | 6.465116 | 48.30946 | 0.13382714 | 8.935618e-01
Asistencia técnica para educación | 6.465116 | 67.55651 | 0.09569938 | 9.237754e-01
Asistencia técnica para el sector laboral | 30.083298 | 13.59479 | 2.21285456 | 2.709580e-02
Asistencia técnica para gestion de la informacion y coordinacion | 101.255240 | 12.60112 | 8.03541460 | 2.218784e-15
Asistencia técnica para protección | 20.504332 | 13.82674 | 1.48294793 | 1.383515e-01
Asistencia técnica para protección de la infancia | 31.265116 | 31.55560 | 0.99079441 | 3.219862e-01
Asistencia técnica para protección social | 1.465116 | 67.55651 | 0.02168727 | 9.827010e-01
Asistencia técnica para protección/gestión de fronteras | 46.768147 | 15.45577 | 3.02593399 | 2.531581e-03
Asistencia técnica para VBG-SSR | 56.374207 | 22.56532 | 2.49826801 | 1.261348e-02
Cohesión_social | 42.384035 | 12.80613 | 3.30966736 | 9.618860e-04
Manejo de la información para socios y análisis de las necesidades | 127.798450 | 14.48858 | 8.82063567 | 3.946044e-18
Manejo de la información y entrega directa de la información a la población | 22.072959 | 11.20661 | 1.96963743 | 4.911023e-02
Medios de vida y formación técnico-profesional | 40.275243 | 10.85537 | 3.71016822 | 2.166249e-04
Necesidades básicas/Otro | 5.545116 | 16.79675 | 0.33013025 | 7.413593e-01
Protección_LGBTI | 73.965116 | 20.55012 | 3.59925528 | 3.321721e-04
Protección_VBG | 42.465116 | 11.72662 | 3.62125723 | 3.054375e-04
Salud | 198.631783 | 29.10471 | 6.82472950 | 1.396240e-11
Trata_y_tráfico | 20.865116 | 31.55560 | 0.66121745 | 5.086001e-01