# Content Statistics Feature - Implementation Plan

## Overview
Add comprehensive statistics display for generated articles in the StepGenerate component, showing metrics like word count, paragraph count, token count, reading time, and more.

## Current State Analysis

### Existing Code Structure
- **Component**: `apps/admin/src/components/steps/StepGenerate.tsx`
- **Current Stats**: Only shows `tokenCount` during streaming (lines 236, 249)
- **Content Display**: Two sections
  1. **Live Generation** (lines 256-284) - Shows streaming content
  2. **Generated Draft** (lines 288-336) - Shows final content
- **Data Available**:
  - `generatedDraft` - HTML string of generated content
  - `tokenCount` - Number of tokens generated (streaming only)
  - `streamingContent` - Real-time content during generation
  - `imagePlaceholders` - Array of image placeholder strings
  - `generationSources` - Array of web sources used

### Current Display Locations
1. **During streaming** (lines 248-250): Shows token count in caption
2. **After generation** (lines 291-301): Shows sources count
3. **After generation** (lines 303-314): Shows image placeholders count

## Proposed Statistics

### Core Metrics
1. **Word Count** - Total words in article (excluding HTML tags)
2. **Character Count** - Total characters (with/without spaces)
3. **Paragraph Count** - Number of `<p>` tags
4. **Heading Count** - Number of `<h2>`, `<h3>`, etc.
5. **List Item Count** - Number of `<li>` tags
6. **Token Count** - AI tokens generated (already available)
7. **Image Placeholder Count** - Already shown, enhance display
8. **Reading Time** - Estimated minutes (avg 200-250 words/min)

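
The word count and reading time metrics above could be computed along these lines. This is a minimal sketch, not the shipped implementation: the regex-based tag stripping, the 225 words/min constant (the midpoint of the 200-250 range), and the rounding rule are all assumptions made for illustration.

```typescript
// Hypothetical helpers: the names follow this plan, the bodies are only a sketch.

/** Assumed reading speed: midpoint of the 200-250 words/min range. */
const WORDS_PER_MINUTE = 225;

/** Replace tags with spaces and collapse whitespace (a regex pass, not a real HTML parser). */
function stripHtmlTags(html: string): string {
  return html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
}

/** Count whitespace-separated words in plain text. */
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

/** Estimate reading time in whole minutes: 0 for empty content, otherwise at least 1. */
function calculateReadingTime(
  wordCount: number,
  wordsPerMinute: number = WORDS_PER_MINUTE,
): number {
  if (wordCount <= 0) return 0;
  return Math.max(1, Math.round(wordCount / wordsPerMinute));
}

const text = stripHtmlTags('<h2>Title</h2><p>Hello <strong>brave</strong> new world.</p>');
const words = countWords(text);
console.log(`${words} words, ${calculateReadingTime(words)} min read`);
```

Because every tag is replaced by a space, a word that is split by inline markup (for example `un<em>believable</em>`) is counted as two words; a DOM-based approach would avoid that at the cost of requiring a browser environment.
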
### Advanced Metrics (Optional)
9. **Sentence Count** - Approximate sentences
10. **Average Words per Paragraph** - Content density
11. **Average Words per Sentence** - Readability indicator
12. **Link Count** - Number of `<a>` tags in content
13. **Generation Time** - Time taken to generate (if available)

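
The sentence count and the two averages are the metrics most likely to hit awkward inputs, so a small sketch may help. It assumes sentences end with `.`, `!` or `?` followed by whitespace or the end of the text, and the `average` helper is a hypothetical name introduced here; abbreviations such as "e.g." will still be miscounted.

```typescript
/** Approximate sentence count: split on terminal punctuation followed by whitespace or end of text. */
function countSentences(text: string): number {
  return text
    .split(/[.!?]+(?:\s+|$)/)
    .filter((part) => part.trim().length > 0).length;
}

/** Average rounded to one decimal place; returns 0 instead of dividing by zero. */
function average(total: number, count: number): number {
  return count === 0 ? 0 : Math.round((total / count) * 10) / 10;
}

const sampleText = 'Hello world. How are you? Fine';
const sentences = countSentences(sampleText);
const wordTotal = sampleText.split(/\s+/).length;
console.log(sentences, average(wordTotal, sentences));
```

Guarding the division matters because empty or tag-only content yields zero paragraphs and zero sentences, which would otherwise surface as `NaN` in the display.
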
## Implementation Plan

### Phase 1: Create Statistics Utility Module ✅
**File**: `apps/admin/src/utils/contentStats.ts` (new file)

```typescript
export interface ContentStatistics {
  wordCount: number;
  characterCount: number;
  characterCountNoSpaces: number;
  paragraphCount: number;
  headingCount: number;
  listItemCount: number;
  sentenceCount: number;
  linkCount: number;
  readingTimeMinutes: number;
  avgWordsPerParagraph: number;
  avgWordsPerSentence: number;
}

export function calculateContentStats(htmlContent: string): ContentStatistics {
  // Implementation details below
}
```

**Functions to implement**:
- `stripHtmlTags(html: string): string` - Remove all HTML tags
- `countWords(text: string): number` - Count words
- `countParagraphs(html: string): number` - Count `<p>` tags
- `countHeadings(html: string): number` - Count `<h1>` to `<h6>` tags
- `countListItems(html: string): number` - Count `<li>` tags
- `countSentences(text: string): number` - Approximate sentence count
- `countLinks(html: string): number` - Count `<a>` tags
- `calculateReadingTime(wordCount: number): number` - Estimate reading time
- `calculateContentStats(htmlContent: string): ContentStatistics` - Main function

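
The four tag-counting functions in the list above can share one helper. The sketch below assumes a regex over the raw HTML string is acceptable (it works outside the browser, but it is not a real parser); `countTags` is a hypothetical helper name that does not appear in the plan.

```typescript
/**
 * Count opening tags whose name matches `tagPattern` (a regex fragment such as 'p' or 'h[1-6]').
 * The lookahead requires whitespace, '>' or '/' after the name, so 'p' does not match <pre>.
 */
function countTags(html: string, tagPattern: string): number {
  const re = new RegExp(`<(?:${tagPattern})(?=[\\s>/])`, 'gi');
  return (html.match(re) ?? []).length;
}

const countParagraphs = (html: string): number => countTags(html, 'p');
const countHeadings = (html: string): number => countTags(html, 'h[1-6]');
const countListItems = (html: string): number => countTags(html, 'li');
const countLinks = (html: string): number => countTags(html, 'a');

const sampleHtml =
  '<h2>T</h2><p>One <a href="#">link</a>.</p><pre>x</pre>' +
  '<ul><li>a</li><li>b</li></ul><hr><h3 id="s">S</h3>';

console.log({
  paragraphs: countParagraphs(sampleHtml),
  headings: countHeadings(sampleHtml),
  listItems: countListItems(sampleHtml),
  links: countLinks(sampleHtml),
});
```

Only opening tags are matched, so closing tags and look-alike names (`<pre>`, `<hr>`, `<link>`, `<article>`) do not inflate the counts.
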
### Phase 2: Create Statistics Display Component ✅
**File**: `apps/admin/src/components/ContentStatistics.tsx` (new file)

```typescript
interface ContentStatisticsProps {
  htmlContent: string;
  tokenCount?: number;
  imagePlaceholderCount?: number;
  generationTimeMs?: number;
  variant?: 'compact' | 'detailed';
}

export default function ContentStatistics({
  htmlContent,
  tokenCount,
  imagePlaceholderCount,
  generationTimeMs,
  variant = 'detailed'
}: ContentStatisticsProps) {
  // Calculate stats using utility
  // Display in clean, organized format
}
```

**Display Design**:
- Use Material-UI `Paper` or `Alert` component
- Grid layout for metrics (2-3 columns on desktop, 1-2 on mobile)
- Icons for each metric (optional)
- Color-coded sections:
  - **Primary metrics** (word count, reading time) - prominent
  - **Structure metrics** (paragraphs, headings) - secondary
  - **Technical metrics** (tokens, generation time) - tertiary

### Phase 3: Integrate into StepGenerate ✅
**File**: `apps/admin/src/components/steps/StepGenerate.tsx`

**Changes needed**:

1. **Import new components**:
   ```typescript
   import ContentStatistics from '../ContentStatistics';
   import { calculateContentStats } from '../../utils/contentStats';
   ```

2. **Add statistics to "Live Generation" section** (after line 280):
   ```typescript
   {/* Live stats during streaming */}
   <ContentStatistics
     htmlContent={streamingContent}
     tokenCount={tokenCount}
     variant="compact"
   />
   ```

3. **Add statistics to "Generated Draft" section** (after line 315, before content preview):
   ```typescript
   {/* Final statistics */}
   <ContentStatistics
     htmlContent={generatedDraft}
     tokenCount={tokenCount}
     imagePlaceholderCount={imagePlaceholders.length}
     variant="detailed"
   />
   ```

4. **Optional: Add generation time tracking**:
   ```typescript
   // Requires: import { useRef, useState } from 'react';

   // Keep the start time in a ref: state read inside the onDone callback would
   // still hold the value from the render in which the callback was created
   const generationStartRef = useRef<number>(0);
   const [generationTimeMs, setGenerationTimeMs] = useState<number>(0);

   // In onClick handler (line 169)
   generationStartRef.current = Date.now();

   // In onDone callback (line 204)
   setGenerationTimeMs(Date.now() - generationStartRef.current);
   ```

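
The timing logic can also be kept outside the component so it is testable without React. The following is a sketch under that assumption: `createGenerationTimer` and `formatGenerationTime` are hypothetical names, and the injectable clock exists only so the example can run without real delays.

```typescript
/** Minimal stopwatch. `now` is injectable so tests can supply a fake clock. */
function createGenerationTimer(now: () => number = Date.now) {
  let startedAt: number | null = null;
  return {
    /** Record the start time, e.g. when the generate button is clicked. */
    start(): void {
      startedAt = now();
    },
    /** Return elapsed milliseconds and reset; returns 0 if start() was never called. */
    stop(): number {
      const begun = startedAt;
      if (begun === null) return 0;
      startedAt = null;
      return now() - begun;
    },
  };
}

/** Format milliseconds as seconds with one decimal place, e.g. for a "Generated in" label. */
function formatGenerationTime(ms: number): string {
  return `${(ms / 1000).toFixed(1)}s`;
}

// Usage with a fake clock standing in for Date.now
let fakeNow = 1_000;
const timer = createGenerationTimer(() => fakeNow);
timer.start();
fakeNow = 13_300;
const elapsedMs = timer.stop();
console.log(formatGenerationTime(elapsedMs));
```
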
### Phase 4: Mobile Optimization ✅
**Ensure responsive design**:
- Stack metrics vertically on mobile (xs breakpoint)
- Use smaller font sizes on mobile
- Collapse less important metrics on mobile
- Use `variant="compact"` for live streaming on mobile

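
One way to express the first three points is with responsive values in Material-UI's `sx` prop. The objects below are illustrative only: the names, column counts, and font sizes are assumptions, and the breakpoint keys (`xs`, `sm`, `md`) are those of MUI's default theme.

```typescript
// Plain style objects intended to be passed to an MUI component's `sx` prop.

/** One column on phones, two on small screens, three from the md breakpoint up. */
const statsGridSx = {
  display: 'grid',
  gap: 1,
  gridTemplateColumns: { xs: '1fr', sm: 'repeat(2, 1fr)', md: 'repeat(3, 1fr)' },
};

/** Slightly smaller text on the xs breakpoint. */
const statLabelSx = {
  fontSize: { xs: '0.75rem', sm: '0.875rem' },
};

/** Hide lower-priority metrics on the xs breakpoint. */
const secondaryStatSx = {
  display: { xs: 'none', sm: 'block' },
};

console.log(statsGridSx, statLabelSx, secondaryStatSx);
```

In the component these would be applied as, for example, `<Box sx={statsGridSx}>` around the metric items.
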
### Phase 5: Testing & Polish ✅
1. Test with various content lengths (short, medium, long articles)
2. Test with different HTML structures (headings, lists, links)
3. Verify mobile responsiveness
4. Add loading states if needed
5. Add tooltips for metric explanations

## Code Structure

### File Organization
```
apps/admin/src/
├── components/
│   ├── ContentStatistics.tsx    # New component
│   └── steps/
│       └── StepGenerate.tsx     # Modified
└── utils/
    └── contentStats.ts          # New utility module
```

### Clean Code Principles
1. **Single Responsibility**: Each function does one thing
2. **Pure Functions**: Stats calculation has no side effects
3. **Reusable**: Stats component can be used elsewhere
4. **Type Safe**: Full TypeScript types
5. **Testable**: Utility functions are easy to unit test
6. **Readable**: Clear naming and documentation

## Implementation Steps

### Step 1: Create Utility Module
- [ ] Create `apps/admin/src/utils/contentStats.ts`
- [ ] Implement HTML parsing functions
- [ ] Implement text analysis functions
- [ ] Implement main `calculateContentStats` function
- [ ] Add TypeScript interfaces
- [ ] Add JSDoc comments

### Step 2: Create Display Component
- [ ] Create `apps/admin/src/components/ContentStatistics.tsx`
- [ ] Design layout (grid/flex)
- [ ] Add responsive breakpoints
- [ ] Implement compact vs detailed variants
- [ ] Add icons (optional)
- [ ] Style with Material-UI theme

### Step 3: Integrate into StepGenerate
- [ ] Import new modules
- [ ] Add to streaming section (compact variant)
- [ ] Add to generated draft section (detailed variant)
- [ ] Optional: Add generation time tracking
- [ ] Test all scenarios

### Step 4: Test & Refine
- [ ] Test with real content
- [ ] Verify mobile layout
- [ ] Check performance (stats calculation should be fast)
- [ ] Add error handling for edge cases
- [ ] Update documentation

## Example Output

### Compact Variant (During Streaming)
```
📊 Live Stats: 342 words • 2 min read • 1,234 tokens • 8 paragraphs
```

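
The compact line above is a simple join of formatted parts. A sketch of how it might be assembled follows; `formatCompactStats` and its input type are hypothetical, the `en-US` locale is assumed for thousands separators, and the "📊 Live Stats:" prefix is left to the caller.

```typescript
/** Subset of the statistics needed for the one-line summary (assumed shape). */
interface CompactStatsInput {
  wordCount: number;
  readingTimeMinutes: number;
  paragraphCount: number;
  tokenCount?: number;
}

/** Build the one-line summary; the token part is omitted when no count is available. */
function formatCompactStats(stats: CompactStatsInput): string {
  const fmt = (value: number): string => value.toLocaleString('en-US');
  const parts: string[] = [
    `${fmt(stats.wordCount)} words`,
    `${fmt(stats.readingTimeMinutes)} min read`,
  ];
  if (stats.tokenCount !== undefined) {
    parts.push(`${fmt(stats.tokenCount)} tokens`);
  }
  parts.push(`${fmt(stats.paragraphCount)} paragraphs`);
  return parts.join(' • ');
}

const compactLine = formatCompactStats({
  wordCount: 342,
  readingTimeMinutes: 2,
  tokenCount: 1234,
  paragraphCount: 8,
});
console.log(compactLine);
```
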
### Detailed Variant (After Generation)
```
┌─────────────────────────────────────────────────────┐
│ Content Statistics                                  │
├─────────────────────────────────────────────────────┤
│ 📝 Words: 1,234           ⏱️ Reading Time: 5 min    │
│ 🔤 Characters: 6,789      📄 Paragraphs: 15         │
│ 📑 Headings: 8            📋 List Items: 12         │
│ 🤖 Tokens: 1,567          🖼️ Images: 3              │
│ 🔗 Links: 5               ⚡ Generated in: 12.3s    │
└─────────────────────────────────────────────────────┘
```

## Benefits

1. **User Insight**: Writers see content metrics at a glance
2. **Quality Control**: Identify too-short or too-long content
3. **SEO Awareness**: Word count and reading time matter for SEO
4. **Content Planning**: Helps plan article structure
5. **Performance Tracking**: Token usage helps manage API costs
6. **Professional Feel**: Adds polish to the editor

## Technical Considerations

### Performance
- Stats calculation should be < 50ms for typical articles
- Use memoization if needed (useMemo)
- Don't recalculate on every render

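
Inside the component, `useMemo` keyed on the HTML string covers the last two points. The idea itself is just "cache the most recent result", which the framework-free sketch below illustrates; `memoizeLast` is a hypothetical helper and is not part of the plan.

```typescript
/** Cache the most recent result, keyed by the input string (similar in spirit to useMemo with one dependency). */
function memoizeLast<R>(fn: (input: string) => R): (input: string) => R {
  let cache: { input: string; result: R } | null = null;
  return (input: string): R => {
    if (cache === null || cache.input !== input) {
      cache = { input, result: fn(input) };
    }
    return cache.result;
  };
}

// Count how often the (stand-in) stats calculation actually runs.
let calculations = 0;
const memoizedWordCount = memoizeLast((html: string): number => {
  calculations += 1;
  return html.replace(/<[^>]*>/g, ' ').split(/\s+/).filter(Boolean).length;
});

memoizedWordCount('<p>one two three</p>');
memoizedWordCount('<p>one two three</p>'); // same input: served from the cache
memoizedWordCount('<p>one two three four</p>'); // new input: recalculated
console.log(calculations);
```

During streaming the content changes with every chunk, so caching alone does not reduce work there; it only prevents recalculation on re-renders where the content is unchanged.
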
### Edge Cases
- Empty content
- Content with only HTML tags
- Very long content (10k+ words)
- Malformed HTML
- Content with inline styles/scripts

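
Several of these cases come down to how text is extracted before counting. The sketch below shows one defensive approach, assuming regex-based stripping; `extractVisibleText` is a hypothetical name, and badly malformed markup (for example an unclosed `<` at the end of the input) is left in the output rather than repaired.

```typescript
/**
 * Return the visible text of an HTML string.
 * Removes <script>/<style> blocks and comments first so their contents are not counted as words.
 */
function extractVisibleText(html: string | null | undefined): string {
  if (!html) return '';
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ') // drop script/style blocks with their contents
    .replace(/<!--[\s\S]*?-->/g, ' ') // drop HTML comments
    .replace(/<[^>]*>/g, ' ') // drop remaining tags
    .replace(/&nbsp;/gi, ' ') // treat non-breaking space entities as whitespace
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(JSON.stringify(extractVisibleText('')));
console.log(JSON.stringify(extractVisibleText('<p></p><div><br/></div>')));
console.log(
  JSON.stringify(
    extractVisibleText('<style>p{color:red}</style><p>Hi</p><script>alert("x")</script>'),
  ),
);
```

Returning an empty string for empty, missing, or tag-only input lets the downstream counters report zeros without special cases.
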
### Accessibility
- Use semantic HTML
- Add ARIA labels if needed
- Ensure color contrast
- Support keyboard navigation

## Future Enhancements

1. **Export Stats**: Download stats as JSON/CSV
2. **Historical Tracking**: Compare stats across generations
3. **Target Metrics**: Set word count goals
4. **SEO Score**: Basic SEO analysis
5. **Readability Score**: Flesch-Kincaid or similar
6. **Keyword Density**: Track keyword usage
7. **Content Comparison**: Compare before/after edits

## Success Criteria

- ✅ Stats display correctly for all content types
- ✅ Mobile-responsive layout
- ✅ Fast calculation (< 50ms)
- ✅ Clean, maintainable code
- ✅ No performance degradation
- ✅ Helpful for content creators

---

**Status**: ✅ IMPLEMENTED - All phases complete!
**Actual Time**: ~30 minutes
**Priority**: Medium
**Complexity**: Low-Medium

## Implementation Summary

### Files Created
1. ✅ `apps/admin/src/utils/contentStats.ts` - Statistics calculation utility
2. ✅ `apps/admin/src/components/ContentStatistics.tsx` - Display component

### Files Modified
1. ✅ `apps/admin/src/components/steps/StepGenerate.tsx` - Integrated statistics

### Features Implemented
- ✅ Word count, character count, reading time
- ✅ Paragraph, heading, list item counts
- ✅ Sentence count and averages
- ✅ Token count display
- ✅ Generation time tracking
- ✅ Image placeholder count
- ✅ Link count
- ✅ Compact variant for live streaming
- ✅ Detailed variant for final draft
- ✅ Mobile-responsive grid layout
- ✅ Performance optimized with useMemo