How to Define the Number and Types of Clustering (Part 1): Building Segmentation Hypotheses with Business Logic | Pixel Lab

How to Decide the Number of Clusters

The number of clusters should not be decided by simply saying, “I want 6 groups.”
Instead, we should first ask:

Can the resulting groups help me make decisions?

So:

If K=6 but only 3 groups can be clearly explained, then K=6 is not a good choice.
If K=4 and every group is clear, strategic, and measurable with KPIs, then K=4 may be better.
If K=5 can clearly identify a growth-potential user group, then K=5 has product value.

In short:

The product question determines the direction of clustering; the features determine what the model can see; the K value determines the level of segmentation detail; and UX / business interpretation determines whether clustering is truly useful.

How to Define the Number of Clusters, or the K Value

“First understand the approximate business-defined scope, then use data to identify reasonable candidate values, and finally use business judgment to decide the final number, naming, and product strategy.”

The K value should be decided neither by intuition alone nor by mathematical scores alone. First use business understanding to define an approximate range, then use data to identify reasonable candidates, and finally choose the K whose clusters are easiest to interpret and act on from a UX / business perspective.

K-means requires us to specify K, the number of clusters, in advance. It divides an unlabeled dataset into K clusters by assigning each data point to its nearest centroid, recomputing each centroid as the mean of its assigned points, and repeating until the assignments stop changing. One limitation of K-means is that the data alone rarely points to a single correct K value.
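To make the assign-and-update loop concrete, here is a minimal sketch in plain Python. The user data, the two features (active days per week, deal clicks per week), and the starting centroids are hypothetical values chosen for illustration. A real analysis would more likely use a library implementation with a smarter initialization.

```python
import math

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points currently assigned to it."""
    updated = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
        else:
            updated.append(old)  # an empty cluster keeps its previous centroid
    return updated

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        labels = assign_to_nearest(points, centroids)
        updated = update_centroids(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical users: (active days per week, deal clicks per week)
users = [(1, 0), (1, 1), (2, 0), (6, 8), (7, 9), (6, 10)]

# K = 2 is fixed up front by the number of starting centroids (assumed values).
labels, centroids = kmeans(users, initial_centroids=[(0, 0), (7, 7)])
print(labels)
print(centroids)
```

Note that K is an input here: the algorithm returns exactly as many groups as it is asked for, whether or not those groups mean anything to the business.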

1. Use the Business Question to Define the Approximate Scope

Do not start by asking:

How many clusters should we create?

Instead, ask:

What problem do I want to solve with clustering?

For example:

Which users are most likely to increase their usage through product design, content recommendation, or promotional incentives?

This question is not simply about finding “low-usage users.” It is about finding:

User groups whose usage has not yet peaked, but who still show clear potential for growth.

How to Set the Direction

If the goal is to “increase usage,” the segmentation should focus on:

Activity level, return visits, interaction depth, and cross-feature usage.

2. Who: Define the Target Users

Confirm which group of users you want to analyze.

For example:

Target Users	Description
All app users	Suitable for overall segmentation
Active users in the last 30 days	Suitable for analyzing usage growth
Low-activity users	Suitable for re-engagement analysis
Jetso / Reward users	Suitable for analyzing deal-oriented behavior
Community interaction users	Suitable for analyzing UGC / community growth

3. What: Define Behavioral Signals

Translate business understanding into observable data signals.

For example:

Business Concept	Measurable Metrics
Activity level	App opens, session count, active days
Content interest	Page views, article views, category views
Deal interest	Jetso clicks, Reward clicks, redemption
Interaction depth	Likes, comments, shares, follows, saves
Search demand	Search count, AI search usage
Churn risk	Days since last visit, inactive days
Conversion behavior	Registration, coupon claim, mission completion
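One way to turn this mapping into model inputs is to aggregate a raw event log into one feature row per user. The sketch below assumes a hypothetical log of (user_id, date, event_name) tuples and hypothetical event names. The actual tracking schema will differ, so treat the mapping as a placeholder.

```python
from collections import defaultdict

# Hypothetical event log rows: (user_id, date, event_name)
events = [
    ("u1", "2024-05-01", "app_open"),
    ("u1", "2024-05-01", "article_view"),
    ("u1", "2024-05-02", "jetso_click"),
    ("u2", "2024-05-01", "app_open"),
    ("u2", "2024-05-03", "article_view"),
    ("u2", "2024-05-03", "comment"),
]

# Assumed mapping from tracked event names to the business concepts in the table.
SIGNAL_OF_EVENT = {
    "app_open": "activity",
    "article_view": "content_interest",
    "jetso_click": "deal_interest",
    "reward_click": "deal_interest",
    "like": "interaction_depth",
    "comment": "interaction_depth",
}

def build_user_features(events):
    """Aggregate raw events into one row of behavioral signals per user."""
    active_days = defaultdict(set)
    counts = defaultdict(lambda: defaultdict(int))
    for user_id, date, event_name in events:
        active_days[user_id].add(date)
        signal = SIGNAL_OF_EVENT.get(event_name)
        if signal is not None:  # events outside the mapping only count toward active days
            counts[user_id][signal] += 1

    signals = sorted(set(SIGNAL_OF_EVENT.values()))
    features = {}
    for user_id, days in active_days.items():
        row = {"active_days": len(days)}
        for signal in signals:
            row[signal] = counts[user_id][signal]
        features[user_id] = row
    return features

features = build_user_features(events)
for user_id, row in features.items():
    print(user_id, row)
```

Each row is what a clustering model would later "see" for that user, so any business concept missing from the mapping is invisible to the model.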

4. How Often: Define High / Medium / Low Thresholds

Use simple rules to create an initial classification.

For example:

Level	Initial Definition
High activity	Uses the app 5+ days per week / high session count
Medium activity	Uses the app 2–4 days per week
Low activity	Uses the app 0–1 days per week
Churn risk	No return visit for 14 or 30 days
High deal interest	Jetso / Reward clicks above average
High content interest	Article views above average

These thresholds do not need to be very precise at the beginning. They can be based on experience or percentiles, such as top 25%, middle 50%, and bottom 25%.
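As a rough illustration, the fixed rules and the percentile approach could be expressed as follows. The function names and the sample click counts are hypothetical. The quartile cut-offs come from Python's statistics.quantiles with its default method, so other tools may give slightly different boundaries.

```python
import statistics

def activity_level(active_days_per_week):
    """Fixed-rule version of the High / Medium / Low activity thresholds."""
    if active_days_per_week >= 5:
        return "high"
    if active_days_per_week >= 2:
        return "medium"
    return "low"

def is_churn_risk(days_since_last_visit, threshold_days=14):
    """Flag users with no return visit for at least threshold_days (14 or 30)."""
    return days_since_last_visit >= threshold_days

def percentile_level(value, all_values):
    """Percentile version: top 25% is high, bottom 25% is low, the rest is medium."""
    q1, _median, q3 = statistics.quantiles(all_values, n=4)
    if value > q3:
        return "high"
    if value < q1:
        return "low"
    return "medium"

# Hypothetical weekly deal clicks for eight users
weekly_deal_clicks = [0, 1, 2, 3, 4, 5, 6, 20]
levels = [percentile_level(v, weekly_deal_clicks) for v in weekly_deal_clicks]

print(activity_level(3))                 # medium
print(is_churn_risk(21))                 # True
print(list(zip(weekly_deal_clicks, levels)))
```

Percentile cut-offs adapt to the actual distribution, which helps when a metric is heavily skewed by a few very active users, as with the 20-click user in this sample.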

5. So What: Define Business Value

Each segment should be able to answer:

What can I do after identifying this segment?

For example:

Initial User Type	Action Value
Highly active loyal users	Maintain loyalty, promote member missions, improve retention
Medium-active potential users	Most suitable for increasing usage
Deal-oriented users	Use offers to drive content and community usage
Content browsing users	Use recommendations, AI Search, and save features to increase return visits
Low-activity / churn-risk users	Use re-engagement campaigns to bring them back

If a segment does not have a clear action, it is usually not worth keeping as an independent business segment.

Target Groups Worth Prioritizing

If the business question is “increase usage,” the most valuable groups to focus on are:

1. Medium-active Potential Users

They already have a usage habit, but they have not yet developed high-frequency behavior.

For example, they may use the app two to four days a week, but not every day.

Strategy direction:
Push personalized content, mission systems, daily check-ins, save reminders, and related article recommendations.

2. Deal-oriented Users

They have clear motivation toward Jetso, rewards, and coupons, but they may only enter the app when there are offers.

Strategy direction:
Use offer pages to guide them toward articles, lifestyle content, community sharing, and member missions to increase cross-feature usage.

3. Content Browsing Users

They are willing to consume content, but their interaction depth is still low.

Strategy direction:
Strengthen related content, AI Search, topic following, author / topic tracking, and personalized homepages.

Groups Not Recommended as the First Priority

Highly Active Loyal Users

They already have high usage. The main goal for this group should be retention, not usage growth.

Extremely Low-activity Users

They may no longer have a clear need, and the cost of reactivating them may be high. Reactivation can still be attempted, but they may not be the most effective target group in the first stage.

Conclusion

Defining segments with business logic is simple and intuitive, which makes it well suited as a starting point for clustering. It divides users into several possible types based on business goals, product experience, and an understanding of user behavior, helping the team quickly establish an analysis direction.

However, this type of classification is essentially a business hypothesis. It does not mean that the data will naturally form the same groups. Clustering is a form of unsupervised learning: it discovers natural groupings in unlabeled data based on the similarity between data points, without knowing which groups the business expects.

Therefore, a more reasonable process is:

First use business logic to define the approximate direction, then use clustering methods to validate it.

For example, we can first hypothesize 5 user types from a business perspective, then use K-means, the Elbow Method, and the Silhouette Score to check whether the data supports this classification. If the data shows that K=4 is more reasonable, consider merging the most similar groups. If K=5 has a slightly lower score but each group has clear characteristics, sufficient user volume, and different strategic value, then K=5 can still be kept.
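The check described above can be sketched in plain Python. The data points, the candidate K values, and the deterministic starting centroids are assumptions made for this illustration. In practice a library such as scikit-learn (KMeans and silhouette_score) would normally be used, with several random initializations per K.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain K-means with a simplified, deterministic start (spread over sorted points)."""
    ordered = sorted(points)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(d) / len(members) for d in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances, the quantity plotted in the Elbow Method."""
    return sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))

def silhouette_score(points, labels):
    """Mean silhouette: (b - a) / max(a, b) per point, where a is the mean distance
    to the point's own cluster and b is the mean distance to the nearest other cluster."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for single-member clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # distance to self is 0
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical users: (active days per 2 weeks, content views per week)
users = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 5), (10, 10), (10, 11), (11, 10)]

results = {}
for k in (2, 3, 4):  # candidate range suggested by the business hypothesis
    labels, centroids = kmeans(users, k)
    results[k] = {
        "inertia": inertia(users, labels, centroids),
        "silhouette": silhouette_score(users, labels),
    }
    print(k, results[k])

best_k = max(results, key=lambda k: results[k]["silhouette"])
print("highest silhouette at K =", best_k)
```

In this toy data the inertia drops sharply from K=2 to K=3 and barely improves at K=4, while the silhouette peaks at K=3. These scores only narrow the candidates: the final choice still depends on whether each resulting group can be named, sized, and acted on.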

The final number and types of clusters should not be decided only by mathematical scores or only by business intuition. They should balance three things:

Data validity, business interpretability, and product actionability.
