Why custom CRMs are harder than off-the-shelf
A Salesforce or HubSpot deployment, even a heavily customised one, has a known schema. The objects are documented, the fields follow conventions, and any analytics tool that supports the platform can load the metadata automatically and start answering questions immediately. A custom CRM has none of that. The schema is whatever your developer chose three years ago, the field naming is whatever made sense at the time ("status_v2", "is_actv", "remrks"), and the only documentation is the original developer's memory - and that developer has moved to a different company.
The second hard problem is conventions. Off-the-shelf CRMs enforce sensible defaults: timestamps in UTC, soft deletes, consistent foreign key naming, audit columns. Custom CRMs inherit whatever the original developer believed was good enough at the time, which means timestamps are in local time with no timezone recorded, deletes are sometimes hard and sometimes soft depending on the table, and the link from "deal" to "company" runs through a join table whose name nobody remembers.
The third problem is API surface. Off-the-shelf CRMs publish a REST or GraphQL API with documented endpoints. Most custom CRMs were built without an API in mind - the application renders HTML server-side directly against the database, and the only API surface is whatever was added later for a mobile app or a Zapier integration, which usually covers 20% of the data.
The four stack patterns we see in Indian SMBs
| Stack | PHP / MySQL | Laravel / MySQL | .NET / SQL Server | Python / PostgreSQL |
|---|---|---|---|---|
| Era / profile | 2010 - 2018 | 2018 onwards | BFSI, engineering | Tech-led, newer builds |
| Connection method | Read-only MySQL user | Read-only MySQL user | SQL Server role grant | Read-only Postgres role |
| Auth model | App tables in same DB | Eloquent users table | AD-integrated common | Django auth or custom |
| Common gotchas | No conventions, cryptic naming | Soft-deletes, IST vs UTC drift | Stored proc business logic | JSONB columns hide schema |
| Schema cleanliness | Variable | Snake_case, predictable | Usually well-normalised | Usually well-modelled |
Connection options ranked by what we recommend
Three honest options for connecting the AI layer to your CRM data, ranked by what works best in practice for Indian SMBs.
- Read-only DB user (preferred). Create a MySQL/Postgres/SQL Server user with SELECT permission on the relevant schema, optionally restricted to specific tables and columns. The AI connects directly. Fastest, most performant, most auditable. About 80% of Indian SMB custom CRM deployments end up here.
- API endpoint. If your CRM has a REST or GraphQL API and the data you care about is exposed there, the AI can read through it. Slower than direct DB, rate-limited by the API itself, and sometimes incomplete - custom CRM APIs rarely cover 100% of the schema. Useful when DB access is genuinely off the table.
- ETL to a staging warehouse. The AI reads from a Postgres or BigQuery staging copy that you populate via a nightly or hourly ETL from the CRM. Adds latency and another moving part, but useful when the CRM database cannot tolerate analytics reads (rare) or when you already have a warehouse you want to consolidate into.
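The preferred option above amounts to a handful of grant statements a DBA runs once. A minimal sketch that generates the Postgres variant - the role name, schema, and table names here are hypothetical, and the exact syntax differs slightly on MySQL and SQL Server:

```python
# Sketch: generate the DDL for a read-only Postgres role scoped to
# specific tables. Role, schema, and table names are hypothetical.

def readonly_grants(role: str, schema: str, tables: list[str]) -> list[str]:
    """Return the statements a DBA would run to create the role."""
    stmts = [
        f"CREATE ROLE {role} WITH LOGIN PASSWORD 'change-me';",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
    ]
    # SELECT only: no INSERT/UPDATE/DELETE, no schema-modifying privileges.
    stmts += [f"GRANT SELECT ON {schema}.{t} TO {role};" for t in tables]
    return stmts

if __name__ == "__main__":
    for s in readonly_grants("ai_reader", "crm", ["leads", "deals", "invoices"]):
        print(s)
```

Restricting the grant to named tables, rather than `ALL TABLES IN SCHEMA`, is what makes the table-scoping described in the security section possible.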
See our full connector list for the supported databases and APIs across all three patterns.
Schema discovery and vocabulary mapping
The first week of any custom-CRM onboarding is schema discovery. The AI connects with read-only access, enumerates tables, samples a few rows from each, and produces a working map of what looks like what. Tables with names like "leads", "deals", "customers", "invoices" are self-explanatory. Tables with names like "tbl_act_v3" or "ph_data" need someone from your team to label them.
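The discovery pass described above can be sketched in a few lines. This uses an in-memory SQLite database as a stand-in for the CRM database, and the "needs a human label" heuristic is illustrative, not KolossusAI's actual logic:

```python
import sqlite3

# Sketch of schema discovery: enumerate tables, sample a few rows,
# and flag cryptically named tables for someone on the team to label.

def discover(conn: sqlite3.Connection, sample_rows: int = 3) -> dict:
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    schema_map = {}
    for t in tables:
        cols = [c[1] for c in conn.execute(f"PRAGMA table_info({t})")]
        sample = conn.execute(f"SELECT * FROM {t} LIMIT {sample_rows}").fetchall()
        # Heuristic: tbl_/ph_ prefixes or version suffixes mean "ask the team".
        needs_label = t.startswith(("tbl_", "ph_")) or "_v" in t
        schema_map[t] = {"columns": cols, "sample": sample,
                         "needs_label": needs_label}
    return schema_map

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER, name TEXT, status_id INTEGER)")
conn.execute("CREATE TABLE tbl_act_v3 (id INTEGER, remrks TEXT)")
schema_map = discover(conn)
```

Here "leads" is self-explanatory while "tbl_act_v3" gets flagged for labelling, mirroring the split described above.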
The second job in week one is vocabulary mapping. Your sales head says "active deals". The CRM has a column called "status_id" with integer values 1 through 8, and the team knows that 2, 3, and 4 mean active. KolossusAI captures this mapping once, with a sentence in plain English from the team, and the AI uses it forever. You do this for the 10 to 30 phrases your team uses regularly, and after that the system understands them natively.
The output of week one is a vocabulary file the team can read and edit, plus a schema graph that shows how tables link. This artefact is also useful for your own team documentation - several customers have used it as the first real CRM data dictionary their organisation ever had.
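A vocabulary file of this kind is conceptually just a phrase-to-predicate map. A minimal sketch, where the phrases, column names, and status codes mirror the examples in the text but are otherwise hypothetical:

```python
# Sketch of a vocabulary file: team phrases captured once, reused forever.
# Column names and status codes are hypothetical examples.

VOCAB = {
    # "the team knows that 2, 3, and 4 mean active"
    "active deals": "status_id IN (2, 3, 4)",
    "stuck deals": "stage = 'negotiation' AND days_in_stage > 30",
    "big deals": "amount >= 10000000",  # 1 crore, amounts stored in rupees
}

def expand(phrase: str) -> str:
    """Return the SQL predicate for a known phrase, else the phrase itself."""
    return VOCAB.get(phrase, phrase)
```

Because the map is plain data, the team can read and edit it directly - which is what makes it usable as a first data dictionary.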
Security considerations for a read-only role
The read-only DB role is the most controlled option. It stacks several independent safety controls so leaked or misused credentials cannot cause damage.
- SELECT only. No INSERT, UPDATE, DELETE, or schema-modifying privileges on the role. The AI cannot change your data even if it tried.
- Table and column scoping. Sensitive tables (passwords, payment tokens, internal compensation) are excluded from the grant. Most modern databases also support column-level grants so individual PII columns can be hidden inside otherwise accessible tables.
- Network isolation. On-premise deployment keeps the DB unexposed to the internet. IP-allowlist deployment limits external reach to the AI's source range. SSH tunnel deployments work when the DB sits behind a jump host.
- Two-layer audit log. KolossusAI's own query log plus the database's general/slow query log give you two independent audit sources that should agree. A useful integrity check during the first audit.
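The two-layer audit check in the last bullet reduces to comparing two sets of statements. A minimal sketch, assuming both logs have been reduced to plain statement strings (real log formats carry timestamps and connection IDs that would need stripping first):

```python
# Sketch of the audit cross-check: the AI-side query log and the
# database's own query log should contain the same statements.

def cross_check(ai_log: list[str], db_log: list[str]) -> dict:
    norm = lambda q: " ".join(q.lower().split())  # case/whitespace-insensitive
    ai = {norm(q) for q in ai_log}
    db = {norm(q) for q in db_log}
    return {
        "only_in_ai_log": sorted(ai - db),  # claimed but never ran: investigate
        "only_in_db_log": sorted(db - ai),  # ran but unclaimed: investigate
    }
```

Two independent sources that agree is the point: either discrepancy list being non-empty during the first audit is a signal worth chasing.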
Live questions vs scheduled reports
For most custom-CRM deployments, live question answering is the unlock. Your sales head types "show me deals over 1 crore stuck in negotiation for more than 30 days" and the answer comes back in seconds, against the live data the CRM is using right now. No exports, no overnight refresh, no "the dashboard hasn't loaded today's data yet" frustration.
Scheduled reports still have a place. Daily morning summary to the sales head's WhatsApp ("yesterday's new leads, deals closed, deals slipped"), weekly leadership pack, monthly board cuts. KolossusAI generates these on a schedule from saved questions, so the same plain-English query you asked once becomes a recurring report without building a dashboard.
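Conceptually, a saved question plus a schedule and a delivery channel is all a recurring report is. A sketch of that shape - the field names and schedule format are hypothetical, not KolossusAI's actual configuration:

```python
from dataclasses import dataclass, field

# Sketch: a saved plain-English question promoted to a recurring report.
# Field names and the cron-style schedule format are hypothetical.

@dataclass
class SavedReport:
    question: str           # the plain-English query, asked once
    schedule: str           # cron-style, e.g. "0 8 * * *" = 8 am daily
    channel: str            # "whatsapp" or "email"
    recipients: list = field(default_factory=list)

morning_summary = SavedReport(
    question="yesterday's new leads, deals closed, deals slipped",
    schedule="0 8 * * *",
    channel="whatsapp",
    recipients=["sales-head"],
)
```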
Where dashboards still help is for frontline operational views (call centre live KPI wall, sales pipeline kanban) where the value is from glancing at the same chart repeatedly. For these we recommend a small Metabase or Power BI alongside, querying the same DB the AI reads.
The KolossusAI custom-CRM onboarding pattern
Three weeks from kickoff to a finance and sales team using the system daily, with a free 14-day POC covering most of weeks one and two.
- Week one: connect and discover. Read-only DB role created on your side, secure connector live on our side, schema discovery completed, vocabulary file drafted with your team in two short calls. By Friday, your sales head can ask three real questions and get correct answers.
- Week two: refine and pilot. Vocabulary refined with the broader team, edge cases mapped (your weird 'stage_secondary' field that means something different on Wednesdays), saved recurring reports configured, WhatsApp / email notifications set up. By Friday, four to six users are using the system daily.
- Week three: roll out. Full team onboarded, audit trail validated, first month of live questions logged for review. You decide whether to extend to write-back use cases (the AI updating CRM fields based on conversation outcomes) in a later phase.
See AI Analytics for Custom CRMs for the full onboarding pattern.